FAQ

Quick answers to common questions about NEO. Contact support: support@heyneo.so.

Quick Navigation

Getting Started – Setup guide and first steps
Data & File Handling – Formats, uploads, and storage
Platform Mode vs VS Code Extension – Choosing your deployment mode
Task Submission – Writing effective tasks
Technical & Security – Architecture, privacy, and security

Getting Started

NEO supports multiple ML domains:

Tabular ML – Classification, regression, clustering, time series
Computer Vision – Image classification, object detection, OCR
NLP – Text classification, sentiment analysis, NER, summarization
Audio & Speech – Speech recognition, audio classification
LLM Fine-tuning – Instruction tuning, LoRA, domain adaptation
Anomaly Detection – Outlier detection, fraud detection

What is NEO and how does it work?

NEO is an autonomous ML agent that automates the full ML pipeline, from data to deployment.

Workflow:

Describe your ML task in natural language
Provide data (upload, URL, or cloud)
NEO analyzes data and selects models
Receive production-ready artifacts with documentation

Example Task:

Build a customer churn prediction model using customer_data.csv. Optimize for recall since missing churners is costly.

NEO handles preprocessing, feature engineering, training, evaluation, and artifact generation automatically.

Do I need ML expertise?

No. NEO is designed for all skill levels.

Beginners	ML Practitioners
Use task templates	Specify models/constraints
Step-by-step explanations	Access detailed reports
Start simple, then progress	Customize deployments & evaluation
Describe your business goal	Modify generated code in VS Code

What matters is a clear description of your goal, not prior ML knowledge.

How long does a typical project take?

Task Type	Duration
Simple tabular models	15-30 min
Image classification	30-60 min
Large datasets (>1GB)	1-3 hrs
NLP fine-tuning	2-6 hrs
Custom deep learning	4-12 hrs

Tip: Start with a small sample, then scale up to the full dataset.

Data & File Handling

Supported Formats

Format	Use Case	Platform	VS Code
CSV	Tabular/time series	✅	✅
Parquet	Large datasets	✅	✅
JSON	Structured/log data	✅	✅
Images	CV tasks	✅ (≤50MB per file)	✅
Audio	Speech/music	✅ (≤50MB per file)	✅

How do I handle large datasets?

Approach:

Platform – Use cloud storage (S3/GCS/Azure)
Convert to Parquet – Faster processing
Test first – Use a 10% sample before running on the full dataset
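
The "test first" step above can be done before any upload. The following is a minimal sketch, not NEO functionality: `sample_csv` and its parameters are hypothetical names, and it assumes a CSV file with a header row. It streams the file, so the full dataset never has to fit in memory.

```python
import csv
import random


def sample_csv(src_path, dst_path, fraction=0.10, seed=42):
    """Write the header plus a random subset of rows to dst_path.

    Hypothetical helper for illustration. Each row is kept independently
    with probability `fraction`, so the result is roughly (not exactly)
    that share of the input. Returns the number of data rows kept.
    """
    rng = random.Random(seed)  # fixed seed makes the sample reproducible
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to sample
        writer.writerow(header)
        for row in reader:
            if rng.random() < fraction:
                writer.writerow(row)
                kept += 1
    return kept
```

For example, `sample_csv("customer_data.csv", "customer_sample.csv")` would write roughly 10% of the rows to a new file that can be submitted first.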

File limits:

Platform upload: 50MB/file
Platform cloud storage: unlimited
VS Code: unlimited local files
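
To see in advance which files fit under the Platform upload limit, a local check along these lines may help. This is a sketch under assumptions: `split_by_upload_limit` is a hypothetical helper, and "50MB" is interpreted here as 50 × 1024 × 1024 bytes, which may differ from how the Platform counts.

```python
import os

# Assumption: "50MB" is read as 50 MiB; the Platform may count differently.
PLATFORM_UPLOAD_LIMIT_BYTES = 50 * 1024 * 1024


def split_by_upload_limit(paths, limit=PLATFORM_UPLOAD_LIMIT_BYTES):
    """Return (ok, too_large): files within the limit and files above it."""
    ok, too_large = [], []
    for path in paths:
        if os.path.getsize(path) <= limit:
            ok.append(path)
        else:
            too_large.append(path)
    return ok, too_large
```

Files in the second list are candidates for cloud storage or the VS Code Extension.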

Does NEO handle missing data?

Yes. NEO detects missing values automatically and imputes them by data type:

Data Type	Strategy
Numerical	Mean, median, predictive
Categorical	Mode, “Unknown”
Time Series	Forward fill, interpolation
Advanced	ML-based imputation
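
To make the simpler strategies in the table concrete, here is a small standard-library sketch. It does not show NEO's internal implementation, which this FAQ does not document. The function names are hypothetical, and missing values are assumed to be represented as `None`.

```python
from statistics import mean, median, mode


def impute_numeric(values, strategy="median"):
    """Fill None entries with the mean or median of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing observed, so nothing to impute from
    fill = mean(present) if strategy == "mean" else median(present)
    return [fill if v is None else v for v in values]


def impute_categorical(values, use_mode=True, placeholder="Unknown"):
    """Fill None entries with the most common value, or with a placeholder."""
    present = [v for v in values if v is not None]
    fill = mode(present) if use_mode and present else placeholder
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled
```

Forward fill only makes sense when the rows are in time order, which is why the table lists it under time series.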

Platform Mode vs VS Code Extension

Feature	Platform	VS Code Extension
Access	Browser	VS Code editor
Setup	Quick, no install	Install once
Data	Upload ≤50MB or cloud	Local + cloud
Artifacts	Downloadable	Generated in workspace
Code Editing	View only	Full IDE + Git
Best For	Prototyping, collaboration	Customization, local dev, large datasets

Which mode should I use?

Platform Mode: Quick results, no setup, collaborative testing

VS Code Extension: Edit code, work with large local files, full IDE features, version control

Task Submission

Good	Poor
Predict customer churn using customer_data.csv (50k rows). Optimize for precision-recall balance.	Do some ML with my data

How do I write an effective task?

Include:

Goal – What to predict/classify
Data – Files, size, key columns
Metrics – How to measure success
Context – Business relevance
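
One way to make sure none of the four elements is forgotten is to assemble the task text from named parts. The sketch below is purely illustrative: `build_task` is a hypothetical helper, not part of NEO, and the result is plain text that you would paste in as your task.

```python
def build_task(goal, data, metrics, context):
    """Join the four task elements into one description.

    Raises ValueError if any element is empty, so an incomplete task
    is caught before it is submitted.
    """
    parts = {"Goal": goal, "Data": data, "Metrics": metrics, "Context": context}
    missing = [name for name, text in parts.items() if not text or not text.strip()]
    if missing:
        raise ValueError("Missing task elements: " + ", ".join(missing))
    # Normalize each part to end with exactly one period.
    return " ".join(text.strip().rstrip(".") + "." for text in parts.values())


task = build_task(
    goal="Predict customer churn",
    data="Use customer_data.csv (50k rows)",
    metrics="Optimize for recall",
    context="Missing churners is costly",
)
print(task)
```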

What metrics should I use?

Task	Metric
Regression	RMSE, MAE, R²
Classification	Accuracy, F1, AUC-ROC
Time Series	MAPE, SMAPE, directional accuracy
Ranking	NDCG, MAP, precision@k
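
For reference, the regression metrics in the table and MAPE follow their standard textbook definitions, sketched below with the standard library only. The function names are illustrative, and how NEO computes or reports these metrics is not specified here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R² is undefined when the targets are constant.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


def mape(y_true, y_pred):
    """Mean absolute percentage error, skipping targets equal to zero."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    if not ratios:
        raise ValueError("MAPE is undefined when every target is zero")
    return 100 * sum(ratios) / len(ratios)
```

RMSE penalizes large errors more heavily than MAE, so the gap between the two gives a rough sense of how much outliers drive the error.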

Map metrics to business goals:

Minimize false positives → precision
Catch as many fraud cases as possible → recall
Balance precision & recall → F1-score
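
The mapping above follows from how the three metrics are defined. The sketch below computes them from true and predicted labels using the standard definitions; the function name and the choice of `1` as the positive class are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision falls as false positives rise; recall falls as positives are missed.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are reasonably high.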

Technical & Security

Feature	Description
Data Encryption	At rest & in transit
No Sharing	Never shared with third parties
Complete Control	Delete or export anytime

Is my data secure?

Platform: Data is encrypted in the cloud, deleted on request, and never shared with third parties.

VS Code: Local files never leave your machine, cloud data is accessed securely with your own credentials, and you keep full control.

Can I see generated code?

Yes, NEO provides:

Preprocessing & modeling code
Step-by-step notebooks
Deployment scripts
Documentation & methodology

What if model performance is low?

Ways to improve:

Improve data quality/features
Adjust metrics/constraints
Provide domain knowledge
Request specific approaches (ensembles, deep learning)

Focus on practical business impact, not perfect accuracy.

Need More Help?

Documentation – Full docs
Use Cases – Real-world examples
Contact Support – Direct help