How to Import CSV Files into Neo4j: A Step-by-Step Tutorial
03.10.2024
- Introduction to Importing CSV into Neo4j
- Preparing Your CSV File for Import
- Setting Up Neo4j for CSV Import
- Importing CSV Using Cypher Commands
- Troubleshooting Common Import Issues
Before importing a CSV file into Neo4j, it is crucial to ensure that the data is properly formatted and structured. A well-prepared file minimizes errors and improves query performance.

1. Formatting Requirements
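- Use a **comma (`,`)** as the delimiter where possible, since that is what `LOAD CSV` expects by default. A **semicolon (`;`)** also works if you add `FIELDTERMINATOR ';'` to the command.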
- Ensure that the file is encoded in UTF-8 to avoid character issues.
- Column names should be clear, without spaces (use underscores instead).
2. Structuring Your Data
Your CSV file should be structured in a way that represents nodes and relationships separately:
“A graph database is most effective when data is normalized and stored in a structured manner.”
Example 1: Nodes CSV (Users)
user_id,name,age,email
1,Alice,30,alice@email.com
2,Bob,25,bob@email.com
Example 2: Relationships CSV (Friendship)
user_id_1,user_id_2,since
1,2,2022
3. Handling Special Cases
- Null values: Leave the field empty to represent missing data. `LOAD CSV` typically reads an empty field as `null`, whereas a literal `NULL` is read as the string "NULL".
- Escape characters: If a field contains commas, enclose it in double quotes (`"`).
- Dates: Ensure consistent formats like `YYYY-MM-DD`.
Once your CSV file is formatted correctly, you are ready to import it into **Neo4j** without issues. Next, we’ll configure Neo4j for a smooth import process.
Before importing a CSV file into Neo4j, you must ensure the file is properly structured. A well-organized dataset reduces errors and improves query efficiency.
1. Formatting Your CSV File
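- Use a comma (`,`) as the delimiter where possible, since that is what `LOAD CSV` expects by default. A semicolon (`;`) also works if you add `FIELDTERMINATOR ';'` to the command.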
- Ensure the file is encoded in UTF-8 to prevent character issues.
- Column names should be simple, without spaces (use underscores instead, e.g., `first_name`).
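If the CSV is generated by a script, these formatting rules can be applied at export time. The following is a minimal Python sketch, not a required step; `normalize_header` and `write_csv` are hypothetical helper names, and the sample columns are made up for illustration.

```python
import csv
import io
import re


def normalize_header(name: str) -> str:
    """Trim a column name and replace runs of whitespace with underscores."""
    return re.sub(r"\s+", "_", name.strip())


def write_csv(stream, header, rows, delimiter=","):
    """Write rows to a text stream, normalizing the header row first."""
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    writer.writerow([normalize_header(h) for h in header])
    writer.writerows(rows)


# In-memory demo. For a real file, open it with
# open(path, "w", encoding="utf-8", newline="") and pass that as the stream.
buffer = io.StringIO()
write_csv(buffer, ["user id", " first name "], [[1, "Alice"], [2, "Bob"]])
csv_text = buffer.getvalue()
print(csv_text)
```

Passing `delimiter=";"` produces a semicolon-separated file instead.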
2. Structuring Data for Neo4j
Keeping nodes and relationships in separate CSV files is the simplest layout for `LOAD CSV`. Each file needs its own header row and a consistent structure.
“Graph databases perform best when data is structured according to relationships rather than traditional table-based formats.”
Example: Nodes CSV (Users)
user_id,name,age,email
1,Alice,30,alice@email.com
2,Bob,25,bob@email.com
Example: Relationships CSV (Friendship)
user_id_1,user_id_2,since
1,2,2022
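Because the relationships file refers to users by ID, every ID in it should also appear in the nodes file. A row that points at a missing node will usually match nothing during import and be skipped without an error. The sketch below is one way to check this up front in Python; the function name and its default column names are assumptions based on the two example files above.

```python
import csv
import io


def find_dangling_relationships(nodes_csv, rels_csv, id_column="user_id",
                                endpoint_columns=("user_id_1", "user_id_2")):
    """Return relationship rows that reference an ID missing from the nodes file."""
    known_ids = {row[id_column] for row in csv.DictReader(nodes_csv)}
    dangling = []
    for row in csv.DictReader(rels_csv):
        if any(row[col] not in known_ids for col in endpoint_columns):
            dangling.append(row)
    return dangling


# In-memory copies of the example files, plus one deliberately broken row (1 -> 3).
nodes = io.StringIO(
    "user_id,name,age,email\n"
    "1,Alice,30,alice@email.com\n"
    "2,Bob,25,bob@email.com\n"
)
rels = io.StringIO("user_id_1,user_id_2,since\n1,2,2022\n1,3,2023\n")
dangling_rows = find_dangling_relationships(nodes, rels)
print(dangling_rows)
```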
3. Handling Common Issues
- Missing values: Leave the field empty where data is unavailable. `LOAD CSV` typically reads an empty field as `null`, whereas a literal `NULL` is read as the string "NULL".
- Escape characters: If a field contains commas, enclose it in double quotes (`"`).
- Date formats: Use a consistent format, such as `YYYY-MM-DD`, to avoid errors.
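When the file is produced programmatically, a CSV library can take care of quoting and empty fields. Here is a short Python sketch using the standard `csv` module; `to_field`, `rows_to_csv`, and the `joined` column are hypothetical names chosen for the example.

```python
import csv
import io
from datetime import date


def to_field(value):
    """Map Python values to CSV fields: None -> empty field, date -> YYYY-MM-DD."""
    if value is None:
        return ""
    if isinstance(value, date):
        return value.isoformat()
    return value


def rows_to_csv(header, rows):
    """Serialize rows to CSV text, quoting only the fields that need it."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([to_field(v) for v in row])
    return out.getvalue()


quoted_text = rows_to_csv(
    ["user_id", "name", "joined"],
    [[1, "Smith, Alice", date(2022, 5, 1)], [2, "Bob", None]],
)
print(quoted_text)
```

The name containing a comma is wrapped in double quotes automatically, and the missing date becomes an empty trailing field.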
4. Verifying Your CSV File
Before importing, review the file to ensure data integrity. You can open it in a text editor or spreadsheet tool.
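For larger files, a manual review can be supplemented with a quick script. The sketch below assumes a comma-delimited file with a header row; `check_csv` is a hypothetical helper that reports missing columns and rows whose field count differs from the header.

```python
import csv
import io


def check_csv(stream, expected_columns):
    """Return a list of human-readable problems found in a CSV text stream."""
    reader = csv.reader(stream)
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    problems = []
    missing = [c for c in expected_columns if c not in header]
    if missing:
        problems.append(f"missing columns: {missing}")
    for row in reader:
        if len(row) != len(header):
            problems.append(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
    return problems


# The second data row has one field too many.
sample = io.StringIO("user_id,name\n1,Alice\n2,Bob,25\n")
problems = check_csv(sample, ["user_id", "name", "email"])
print(problems)
```

For a file on disk, pass `open(path, encoding="utf-8", newline="")` as the stream.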
With your CSV properly formatted, you’re ready to proceed with importing it into Neo4j efficiently.
Before you can import your CSV file into Neo4j, you need to ensure the database is properly set up. This includes configuring Neo4j settings, placing your CSV files in the correct directory, and verifying database access.
1. Configuring Neo4j for CSV Import
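By default, Neo4j restricts `LOAD CSV` to files inside its import directory for security reasons. The directory is controlled by a setting in the Neo4j configuration file.
Steps to Check the Import Setting:
- Locate the `neo4j.conf` file in the `conf/` directory of your Neo4j installation.
- Find the import directory setting, which may be commented out:
#dbms.directories.import=import
- If it is commented out, remove the leading `#` so that `file:///` URLs resolve relative to the `import` directory:
dbms.directories.import=import
- Save the file and restart Neo4j.
In Neo4j 5 and later, the same setting is named `server.directories.import`.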
2. Placing CSV Files in the Import Directory
Neo4j requires CSV files to be in the `import` directory of your Neo4j installation. To move your files:
mv your_data.csv /path/to/neo4j/import/
3. Verifying Database Access
Ensure Neo4j is running and accessible before attempting the import. Open Neo4j Browser and run:
:server status
If Neo4j is running, proceed with the import. Otherwise, restart the service using:
neo4j start
4. Testing CSV File Readability
Before full import, test whether Neo4j can read your CSV file:
LOAD CSV WITH HEADERS FROM 'file:///your_data.csv' AS row RETURN row LIMIT 5;
- If the query returns data: The file is correctly placed, and you can proceed with import.
- If an error occurs: Check file placement and permissions.
Once Neo4j is properly set up, you’re ready to begin the actual data import process.
Once your CSV file is prepared and Neo4j is properly configured, you can import the data using Cypher commands. Cypher provides a powerful and flexible way to load structured data into your Neo4j database.
1. Understanding the `LOAD CSV` Command
The `LOAD CSV` command in Cypher is used to read data from a CSV file and process it into nodes and relationships.
Basic Syntax:
LOAD CSV WITH HEADERS FROM 'file:///your_data.csv' AS row RETURN row LIMIT 5;
This command checks if the CSV file is accessible and displays the first five rows.
2. Creating Nodes from CSV Data
To create nodes in Neo4j from your CSV file, use:
LOAD CSV WITH HEADERS FROM 'file:///your_data.csv' AS row CREATE (:Person {name: row.name, age: toInteger(row.age), city: row.city});
- `WITH HEADERS` ensures the first row is treated as column headers.
- `CREATE` adds a node of type `Person` with properties.
- `toInteger()` converts text values into numbers where necessary.
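`toInteger()` returns `null` when a string cannot be parsed, so bad values tend to disappear quietly rather than raise an error. One option is to scan the column before importing. The Python sketch below is a rough check under that assumption; `non_integer_rows` is a hypothetical helper, and Python's `int()` is slightly more permissive than Cypher's parser, so treat the result as a hint rather than a guarantee.

```python
import csv
import io


def non_integer_rows(stream, column):
    """Return (line number, raw value) pairs whose value does not parse as an integer."""
    bad = []
    reader = csv.DictReader(stream)
    for row in reader:
        value = (row.get(column) or "").strip()
        try:
            int(value)
        except ValueError:
            bad.append((reader.line_num, value))
    return bad


sample = io.StringIO(
    "name,age,city\n"
    "Alice,30,Berlin\n"
    "Bob,unknown,Paris\n"
    "Cara,,Rome\n"
)
suspect = non_integer_rows(sample, "age")
print(suspect)
```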
3. Creating Relationships from CSV Data
If your CSV file contains relationships, you can import them by combining `MATCH` and `CREATE`:
LOAD CSV WITH HEADERS FROM 'file:///relationships.csv' AS row MATCH (a:Person {name: row.person1}), (b:Person {name: row.person2}) CREATE (a)-[:FRIENDS_WITH]->(b);
- `MATCH` finds existing nodes.
- `CREATE` establishes a relationship between them.
4. Handling Large Imports Efficiently
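For large datasets, use `USING PERIODIC COMMIT` to commit in batches rather than in one large transaction:
USING PERIODIC COMMIT 1000 LOAD CSV WITH HEADERS FROM 'file:///large_data.csv' AS row CREATE (:Person {name: row.name, age: toInteger(row.age)});
This commits after every 1,000 rows, reducing memory load. `USING PERIODIC COMMIT` was deprecated in Neo4j 4.4 and removed in Neo4j 5, where `CALL { ... } IN TRANSACTIONS OF 1000 ROWS` serves the same purpose.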
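The batching idea is not specific to Cypher. As an illustration only (this is not how Neo4j implements it internally), the Python sketch below reads a CSV in fixed-size chunks, which is the same pattern a client script might use when sending rows to the database in batches. `batched_rows` is a hypothetical helper.

```python
import csv
import io
from itertools import islice


def batched_rows(stream, batch_size=1000):
    """Yield lists of up to batch_size row dicts from a CSV text stream."""
    reader = csv.DictReader(stream)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Five data rows split into batches of two.
sample = io.StringIO("name,age\n" + "".join(f"person{i},{20 + i}\n" for i in range(5)))
batch_sizes = [len(batch) for batch in batched_rows(sample, batch_size=2)]
print(batch_sizes)
```

Only one batch is held in memory at a time, which is the property that makes batch commits helpful for large files.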
5. Verifying Data After Import
To check imported data, run:
MATCH (p:Person) RETURN p LIMIT 10;
This query retrieves the first ten `Person` nodes for validation.
By following these steps, you can efficiently import CSV files into Neo4j using Cypher commands, ensuring structured and connected data within your graph database.
When importing CSV files into Neo4j, you may encounter various errors that prevent a smooth data import. Understanding common issues and their solutions can save time and frustration. Below are some of the most frequent problems and how to resolve them.
1. File Not Found Error
If Neo4j can’t locate your CSV file, check the following:
- Ensure the file is in the `import` directory. By default, Neo4j only reads files from `neo4j/import`.
- Use the correct file path. The command should follow this format:
LOAD CSV WITH HEADERS FROM 'file:///your_data.csv' AS row RETURN row LIMIT 5;
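If the file is outside the import folder, the simplest fix is to move it there. Alternatively, comment out the `dbms.directories.import` line in `neo4j.conf` so that `LOAD CSV` can read from anywhere on the filesystem, keeping in mind that this weakens the default security restriction. Also confirm that file URLs have not been disabled. This setting should be `true`, which is its default:
dbms.security.allow_csv_import_from_file_urls=true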
2. Incorrect Data Formatting
Neo4j expects properly formatted CSV files. Common mistakes include:
- Missing headers: Ensure the first row contains column names.
- Extra spaces: Use `TRIM()` to clean up whitespace:
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row CREATE (:Person {name: TRIM(row.name), age: toInteger(row.age)});
3. Type Conversion Errors
If numbers or dates fail to import correctly, check the data types:
- Convert string to integer: `toInteger(row.age)`
- Convert string to float: `toFloat(row.salary)`
- Convert string to date: `date(row.birthday)` (expects an ISO 8601 string such as `2024-10-03`)
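Date strings in another format are a common reason for conversion failures, so it can help to check them before the import. The Python sketch below accepts only strict `YYYY-MM-DD` values that are also real calendar dates; `is_iso_date` is a hypothetical helper name.

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_iso_date(text: str) -> bool:
    """Return True if text is a valid calendar date in YYYY-MM-DD form."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = map(int, match.groups())
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


for candidate in ["2024-10-03", "03.10.2024", "2024-02-30"]:
    print(candidate, is_iso_date(candidate))
```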
4. Duplicate Data Issues
To prevent duplicate nodes, use `MERGE` instead of `CREATE`:
LOAD CSV WITH HEADERS FROM 'file:///your_data.csv' AS row MERGE (p:Person {name: row.name}) ON CREATE SET p.age = toInteger(row.age);
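It can also be useful to know in advance which keys repeat in the source file, since repeated names with different ages may point to a data problem rather than a true duplicate. A minimal Python sketch, with `duplicate_keys` as a hypothetical helper:

```python
import csv
import io
from collections import Counter


def duplicate_keys(stream, key_column):
    """Return key values that appear more than once, mapped to their counts."""
    counts = Counter(row[key_column] for row in csv.DictReader(stream))
    return {key: n for key, n in counts.items() if n > 1}


sample = io.StringIO("name,age\nAlice,30\nBob,25\nAlice,31\n")
repeated = duplicate_keys(sample, "name")
print(repeated)
```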
5. Memory and Performance Problems
For large imports, Neo4j may slow down or crash. Optimize performance by:
- Using batch commits: Load data in chunks with:
USING PERIODIC COMMIT 1000 LOAD CSV WITH HEADERS FROM 'file:///large_data.csv' AS row CREATE (:Person {name: row.name, age: toInteger(row.age)});
- Indexing nodes: Create an index on the properties used in `MATCH` and `MERGE` before importing, so that lookups do not have to scan every node:
CREATE INDEX FOR (p:Person) ON (p.name);
6. Encoding Issues
If you see strange characters, your file may have encoding issues. Convert it to UTF-8:
iconv -f ISO-8859-1 -t UTF-8 input.csv -o output.csv
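Where `iconv` is not available, the same conversion can be done in Python. The sketch below assumes the source encoding is known to be ISO-8859-1; `to_utf8` and `reencode_file` are hypothetical helper names.

```python
from pathlib import Path


def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def reencode_file(source: str, target: str, source_encoding: str = "iso-8859-1") -> None:
    """Write a UTF-8 copy of source to target, leaving line endings untouched."""
    Path(target).write_bytes(to_utf8(Path(source).read_bytes(), source_encoding))


# In-memory demo: "é" is one byte in ISO-8859-1 and two bytes in UTF-8.
latin1_bytes = "José".encode("iso-8859-1")
utf8_bytes = to_utf8(latin1_bytes)
print(len(latin1_bytes), len(utf8_bytes))
```

Calling `reencode_file("input.csv", "output.csv")` mirrors the command above. ISO-8859-1 can decode any byte sequence without raising an error, so a wrong guess about the source encoding produces garbled text rather than a failure; it is worth spot-checking the output.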
7. Verifying Data After Import
To confirm data is correctly loaded, run:
MATCH (p:Person) RETURN p LIMIT 10;
By addressing these common import issues, you can ensure a smooth and error-free data import process in Neo4j.