Converting CSV (Comma-Separated Values) files to JSON (JavaScript Object Notation) is a common task in data processing. Python provides various libraries to handle these formats, making it straightforward to convert between them. Below, we’ll walk through the process of converting a CSV file to a JSON file using Python.
Using the csv and json Modules
Python’s built-in csv module provides functionality to read and write CSV files, while the json module allows you to work with JSON data. Together, these modules can be used to convert a CSV file into a JSON file.
Example Code
Suppose you have a CSV file named data.csv with the following content:
name,age,city
John,23,New York
Anna,30,San Francisco
Mike,22,Chicago
Here’s how you can convert this CSV file into a JSON file:
import csv
import json
# Path to the input CSV file
csv_file_path = 'data.csv'
# Path to the output JSON file
json_file_path = 'data.json'
# Read the CSV file and convert it to a list of dictionaries
data = []
with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        data.append(row)
# Write the list of dictionaries to a JSON file
with open(json_file_path, mode='w', encoding='utf-8') as json_file:
    json.dump(data, json_file, indent=4)
print(f"CSV file '{csv_file_path}' has been converted to JSON file '{json_file_path}'.")
Output JSON File
The output JSON file named data.json will look like this:
[
    {
        "name": "John",
        "age": "23",
        "city": "New York"
    },
    {
        "name": "Anna",
        "age": "30",
        "city": "San Francisco"
    },
    {
        "name": "Mike",
        "age": "22",
        "city": "Chicago"
    }
]
Explanation
Here’s a breakdown of the steps:
- Reading the CSV File: The csv.DictReader class reads each row of the CSV file as a dictionary, where the keys are the column headers and the values are the data in each row.
- Converting to JSON: The list of dictionaries is passed to the json.dump() function, which writes the data to a JSON file. The indent=4 parameter formats the JSON output with an indentation level of 4 spaces, making it more readable.
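Note that csv.DictReader yields every value as a string, which is why age appears as "23" rather than 23 in the output. If numeric JSON values are wanted, one option is to convert selected columns before dumping. The following is a minimal sketch under that assumption; the helper name convert_row and its int_fields parameter are hypothetical and not part of the csv module, and the data is read from an in-memory string so the example runs on its own:

```python
import csv
import io
import json

# Sample data matching data.csv from the example; io.StringIO stands in for the file.
CSV_TEXT = "name,age,city\nJohn,23,New York\nAnna,30,San Francisco\nMike,22,Chicago\n"


def convert_row(row, int_fields=("age",)):
    """Return a copy of row with the named columns converted to int.

    Hypothetical helper: int_fields is an assumption about which columns
    hold integers. Missing or non-numeric values are left unchanged.
    """
    converted = dict(row)
    for field in int_fields:
        try:
            converted[field] = int(converted[field])
        except (KeyError, ValueError, TypeError):
            pass  # keep the original value if it cannot be converted
    return converted


reader = csv.DictReader(io.StringIO(CSV_TEXT))
data = [convert_row(row) for row in reader]

# age is now written as a JSON number instead of a string
json_text = json.dumps(data, indent=4)
print(json_text)
```

Listing the columns to convert explicitly avoids guessing at types, so values such as postal codes with leading zeros are not turned into numbers by accident.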
Handling CSV Files with Different Delimiters
If your CSV file uses a different delimiter, such as a semicolon (;) instead of a comma, you can specify the delimiter when creating the DictReader object:
with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=';')
    for row in csv_reader:
        data.append(row)
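To see what the delimiter argument changes, the sketch below parses the same semicolon-delimited text twice, once with delimiter=';' and once with the default comma. The sample data and variable names are made up for illustration, and an in-memory string is used in place of a file:

```python
import csv
import io
import json

# Hypothetical semicolon-delimited input; io.StringIO stands in for an open file.
semicolon_text = "name;age;city\nJohn;23;New York\nAnna;30;San Francisco\n"

# With the correct delimiter, each column becomes its own key.
reader = csv.DictReader(io.StringIO(semicolon_text), delimiter=";")
rows = list(reader)

# With the default comma delimiter, each line is treated as a single field,
# so every row ends up with one key: the entire header line.
rows_default = list(csv.DictReader(io.StringIO(semicolon_text)))

print(json.dumps(rows, indent=4))
print(json.dumps(rows_default, indent=4))
```

A JSON file whose objects each have a single long key is therefore a good hint that the delimiter was not set correctly.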
Handling Special Characters and Encoding
If your CSV file contains special characters or uses a different encoding, specify that encoding when opening the file; otherwise Python falls back to a platform-dependent default. For example, to read a UTF-8 encoded file:
with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        data.append(row)
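Two related details often come up with non-ASCII data: some tools export CSV with a UTF-8 byte order mark (BOM), and json.dump() escapes non-ASCII characters by default. The sketch below is one way to handle both, assuming a BOM-prefixed input file; the file names and sample values are hypothetical, and the input file is created in a temporary directory so the example is self-contained:

```python
import csv
import json
import os
import tempfile

# Create a small sample CSV that starts with a UTF-8 BOM.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "people.csv")    # hypothetical file name
json_path = os.path.join(tmp_dir, "people.json")  # hypothetical file name
with open(csv_path, mode="w", newline="", encoding="utf-8-sig") as f:
    f.write("name,city\nZoë,Kraków\n")

# 'utf-8-sig' strips the BOM on read; with plain 'utf-8' the first
# header would come back as '\ufeffname'.
with open(csv_path, mode="r", newline="", encoding="utf-8-sig") as f:
    rows = list(csv.DictReader(f))

# ensure_ascii=False writes non-ASCII characters as-is instead of
# escaping them as \uXXXX sequences.
with open(json_path, mode="w", encoding="utf-8") as f:
    json.dump(rows, f, indent=4, ensure_ascii=False)

with open(json_path, mode="r", encoding="utf-8") as f:
    json_text = f.read()
```

Both forms of output are valid JSON; ensure_ascii=False only makes the file easier to read for people.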
Conclusion
Converting CSV files to JSON in Python is a straightforward process using the csv and json modules. This method allows you to efficiently handle data transformation tasks, making it easy to integrate CSV data into applications that require JSON format. With the ability to specify delimiters and encoding, you can adapt the conversion process to various types of CSV files.