Data format requirements
GraphHelix accepts CSV, Excel, SPSS, and Stata files. This page covers the specifics of each format and tips for preparing your data.
Supported formats
| Format | Extensions | Parsing | Special features |
|---|---|---|---|
| CSV | .csv | Client-side (Papa Parse) | Auto-detects comma, tab, semicolon delimiters |
| Excel | .xlsx, .xls | Client-side | First sheet imported by default |
| SPSS | .sav | Server-side (pyreadstat) | Variable labels, value labels, metadata summary |
| Stata | .dta | Server-side | Variable labels as column headers (checkbox, checked by default) |
CSV files
CSV is the most universal format and the easiest to prepare. GraphHelix auto-detects the delimiter (comma, tab, or semicolon).
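If you want to confirm which delimiter a file uses before uploading it, a local check along these lines may help. This is a minimal sketch using Python's standard `csv.Sniffer`, not GraphHelix's own detection (which runs in the browser via Papa Parse); the function name `detect_delimiter` is hypothetical.

```python
import csv


def detect_delimiter(sample: str) -> str:
    """Guess the delimiter of a CSV text sample.

    The candidates are limited to comma, tab, and semicolon, the three
    delimiters this page lists. This uses Python's csv.Sniffer and is only
    an approximation of what an importer might decide.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=",\t;")
    return dialect.delimiter


# Example: a semicolon-delimited export, common in some European locales.
sample_text = "id;group;score\n1;Treatment;23.4\n2;Control;19.8\n"
delimiter = detect_delimiter(sample_text)
```

`csv.Sniffer` raises `csv.Error` when it cannot settle on a delimiter, which is usually a sign that the file has metadata rows or an inconsistent number of columns.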
Requirements
- First row must be column headers
- One observation per row, one variable per column
- Consistent data types within each column (don't mix numbers and text in the same column)
- UTF-8 encoding recommended (in Excel, choose the "CSV UTF-8" file type when saving)
Tips
- Use short, descriptive column names without special characters (e.g., `blood_pressure` rather than `Blood Pressure (mmHg) #1`)
- Missing values can be empty cells or `NA`; GraphHelix handles both
- If your CSV has metadata rows above the header (common with instrument exports), remove them before importing
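The naming and missing-value tips above can be applied in bulk before import. The following is a minimal sketch in plain Python; `clean_column_name` and `is_missing` are hypothetical helper names, and the cleaning rule (lowercase, underscores for everything non-alphanumeric) is one reasonable convention rather than something GraphHelix requires.

```python
import re


def clean_column_name(name: str) -> str:
    """Lowercase a header and collapse each run of non-alphanumeric
    characters into a single underscore, trimming underscores at the ends."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def is_missing(cell: str) -> bool:
    """Treat empty cells and the literal NA as missing, the two forms
    mentioned in the tips above."""
    return cell.strip() in ("", "NA")


headers = ["Blood Pressure (mmHg) #1", "Patient ID", "group"]
cleaned = [clean_column_name(h) for h in headers]
```

The cleaned names keep every word from the original header, so you may still want to shorten some of them by hand.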
Excel files
GraphHelix imports the first sheet of .xlsx and .xls files. The same column structure rules apply as for CSV.
Common issues
- Merged cells — unmerge all cells before importing; merged cells cause misalignment
- Multiple header rows — only the first row is treated as column names
- Formulas — calculated values are imported correctly; formulas themselves are not
- Multiple sheets — only the first sheet is imported; move your data to Sheet 1 if needed
SPSS files (.sav)
SPSS files are parsed server-side and preserve the rich metadata that SPSS stores with your data.
What's preserved
- Variable labels — descriptive labels appear alongside variable names
- Value labels — coded values (1 = "Male", 2 = "Female") are shown in the import preview
- Variable count and row count — displayed in the import summary
Switching from SPSS? Your existing .sav files work in GraphHelix without any conversion. All variable labels and value labels are preserved automatically.
Stata files (.dta)
Stata files are parsed server-side with full support for variable labels.
Variable labels
When importing a Stata file, you'll see a checkbox: "Use variable labels as column headers" (checked by default). This replaces short variable names (e.g., bp_sys) with their descriptive labels (e.g., "Systolic Blood Pressure"). A preview table shows the first 5 variable-to-label mappings so you can decide before importing.
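The effect of the checkbox can be pictured as a simple name-to-label substitution. The sketch below is illustrative only: the function names are hypothetical, and falling back to the variable name when a label is missing is an assumption, since this page does not say how unlabeled variables are handled.

```python
def headers_from_labels(names, labels, use_labels=True):
    """Return the column headers to display.

    names:  variable names in file order, e.g. ["bp_sys", "id"]
    labels: mapping of variable name to descriptive label
    Assumption: a variable without a label keeps its short name.
    """
    if not use_labels:
        return list(names)
    return [labels.get(name) or name for name in names]


def preview_mappings(names, labels, limit=5):
    """Return the first `limit` (name, header) pairs, similar in spirit
    to the preview table described above."""
    return [(name, labels.get(name) or name) for name in names[:limit]]


names = ["bp_sys", "bp_dia", "id"]
labels = {"bp_sys": "Systolic Blood Pressure", "bp_dia": "Diastolic Blood Pressure"}
headers = headers_from_labels(names, labels)
```

With the option turned off (`use_labels=False`), the original short names are returned unchanged.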
Switching from Stata? Your .dta files work directly — no need to export to CSV first. Variable labels and value labels are preserved.
Column type detection
GraphHelix automatically infers column types during import:
| Detected type | Examples | Used for |
|---|---|---|
| Number | 23.4, 100, -5.67 | Statistical tests, continuous outcomes, predictors |
| Text | "Treatment", "Control", "Male" | Grouping variables, categorical factors |
| Date | 2026-03-16, 03/16/2026 | Time-to-event analysis, longitudinal data |
You can override any column type during import if the auto-detection doesn't match your data.
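To anticipate how a column is likely to be classified, you can run a rough check locally. This is a minimal sketch of one possible inference rule, not GraphHelix's actual algorithm: it assumes the two date layouts shown in the table, treats empty cells and NA as missing, and falls back to Text whenever values are mixed. All names here are hypothetical.

```python
from datetime import datetime

# Assumption: only the two date layouts shown in the table above.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")
MISSING = {"", "NA"}


def _is_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_date(value: str) -> bool:
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_column_type(values):
    """Classify a column of raw strings as "Number", "Date", or "Text".

    Missing cells are ignored. A column is Number or Date only if every
    remaining value parses as such; anything else is Text.
    """
    present = [v.strip() for v in values if v.strip() not in MISSING]
    if not present:
        return "Text"
    if all(_is_number(v) for v in present):
        return "Number"
    if all(_is_date(v) for v in present):
        return "Date"
    return "Text"
```

Under this rule a single stray text value, such as a unit typed into one cell, turns an otherwise numeric column into Text, which is why consistent data types within each column matter.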
Export options
After running analyses, you can export results in several formats:
- APA string — one-click copy of formatted result text
- Markdown — full analysis report with APA citation, summary, and AI interpretation
- JSON — complete results object for programmatic use
- Figures — PNG (150/300/600 DPI) or SVG with journal presets for Nature, Science, PLOS ONE, JAMA, and Cell
- Audit trail — Markdown table of all analyses for journal supplementary materials
Ready to import your data? Join the beta and try GraphHelix.