September 11, 2024

Converting CSV to JSON in Python

Converting CSV (Comma-Separated Values) files to JSON (JavaScript Object Notation) is a common task in data processing. Python’s standard library includes modules for both formats, so the conversion needs no third-party packages. Below, we’ll walk through converting a CSV file to a JSON file using Python.

Using the csv and json Modules

Python’s built-in csv module provides functionality to read and write CSV files, while the json module allows you to work with JSON data. Together, these modules can be used to convert a CSV file into a JSON file.

Example Code

Suppose you have a CSV file named data.csv with the following content:

name,age,city
John,23,New York
Anna,30,San Francisco
Mike,22,Chicago
    

Here’s how you can convert this CSV file into a JSON file:

import csv
import json

# Path to the input CSV file
csv_file_path = 'data.csv'

# Path to the output JSON file
json_file_path = 'data.json'

# Read the CSV file and convert it to a list of dictionaries
data = []
with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        data.append(row)

# Write the list of dictionaries to a JSON file
with open(json_file_path, mode='w', encoding='utf-8') as json_file:
    json.dump(data, json_file, indent=4)

print(f"CSV file '{csv_file_path}' has been converted to JSON file '{json_file_path}'.")
    

Output JSON File

The output JSON file named data.json will look like this:

[
    {
        "name": "John",
        "age": "23",
        "city": "New York"
    },
    {
        "name": "Anna",
        "age": "30",
        "city": "San Francisco"
    },
    {
        "name": "Mike",
        "age": "22",
        "city": "Chicago"
    }
]
    

Explanation

Here’s a breakdown of the steps:

  • Reading the CSV File: The csv.DictReader class reads each row of the CSV file as a dictionary, where the keys are the column headers (taken from the first line of the file) and the values are the data in that row. Every value is read as a string, which is why age appears as "23" rather than 23 in the output above.
  • Converting to JSON: The list of dictionaries is passed to the json.dump() function, which serializes it and writes it to the JSON file. The indent=4 argument pretty-prints the output with 4 spaces per nesting level, making it more readable.
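
Because DictReader returns every value as a string, you may want numbers to appear as JSON numbers instead. Here is a minimal sketch of one way to do that. The coerce_value helper is a hypothetical name introduced for this illustration (it is not part of the csv or json modules), and the sample data is held in memory so the snippet runs without any files:

```python
import csv
import io
import json


def coerce_value(value):
    """Hypothetical helper: try int, then float, else return the value unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


# In-memory stand-in for data.csv so the sketch needs no files on disk
sample_csv = io.StringIO(
    "name,age,city\n"
    "John,23,New York\n"
    "Anna,30,San Francisco\n"
)

# Build a new dictionary per row with each value coerced where possible
typed_rows = [
    {key: coerce_value(value) for key, value in row.items()}
    for row in csv.DictReader(sample_csv)
]

typed_json = json.dumps(typed_rows, indent=4)
print(typed_json)
```

With this approach, "age" is written as 23 instead of "23". Blanket coercion is a simplification: it would also turn values such as ZIP codes or IDs with leading zeros into numbers, and text like "nan" into a float that strict JSON parsers may reject, so in practice you may prefer to convert only the columns you know are numeric.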

Handling CSV Files with Different Delimiters

If your CSV file uses a different delimiter, such as a semicolon (;) instead of a comma, you can specify the delimiter when creating the DictReader object:

with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=';')
    for row in csv_reader:
        data.append(row)
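
To see why the delimiter argument matters, the following self-contained sketch parses the same semicolon-separated text twice, once with the default comma delimiter and once with delimiter=';'. The sample data and variable names are made up for this illustration, and io.StringIO stands in for an opened file:

```python
import csv
import io
import json

# Semicolon-separated sample data (an assumption for this illustration)
semicolon_text = (
    "name;age;city\n"
    "John;23;New York\n"
    "Anna;30;San Francisco\n"
)

# Default delimiter (comma): each line is treated as one single column
wrong_rows = list(csv.DictReader(io.StringIO(semicolon_text)))

# Correct delimiter: the columns are split as intended
rows = list(csv.DictReader(io.StringIO(semicolon_text), delimiter=';'))

print(list(wrong_rows[0].keys()))
print(json.dumps(rows, indent=4))
```

With the default delimiter, the only key is the whole header line "name;age;city", so the resulting JSON would contain one long string per row. Passing delimiter=';' produces the expected name, age, and city keys.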
    

Handling Special Characters and Encoding

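If your CSV file contains non-ASCII characters, specify its encoding explicitly when opening it, because open() otherwise falls back to a platform-dependent default. For example, to read a UTF-8 encoded file (substitute another codec name, such as 'latin-1', if the file was saved differently):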

with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        data.append(row)
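
Encoding also matters on the output side. By default, json.dump() escapes non-ASCII characters (for example, é is written as \u00e9); passing ensure_ascii=False keeps them readable, provided the output file is opened with a suitable encoding such as UTF-8. The sketch below assumes a Latin-1 encoded input file, which it creates in a temporary directory so that it can run on its own. The file names and sample data are made up for this illustration:

```python
import csv
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'latin1.csv')
    json_path = os.path.join(tmp_dir, 'latin1.json')

    # Create a small Latin-1 encoded CSV file to convert
    with open(csv_path, mode='w', newline='', encoding='latin-1') as f:
        f.write("name,city\nJosé,São Paulo\n")

    # Read it back using the same encoding it was written with
    with open(csv_path, mode='r', newline='', encoding='latin-1') as csv_file:
        rows = list(csv.DictReader(csv_file))

    # Write UTF-8 JSON and keep non-ASCII characters unescaped
    with open(json_path, mode='w', encoding='utf-8') as json_file:
        json.dump(rows, json_file, indent=4, ensure_ascii=False)

    # Read the result back as text to inspect it
    with open(json_path, mode='r', encoding='utf-8') as json_file:
        json_text = json_file.read()

# For comparison: the default behaviour escapes non-ASCII characters
escaped_text = json.dumps(rows)

print(json_text)
print(escaped_text)
```

Both outputs are valid JSON and load back to the same data; ensure_ascii=False only affects how the characters are written in the file.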
    

Conclusion

Converting CSV files to JSON in Python takes only a few lines with the built-in csv and json modules: csv.DictReader turns each row into a dictionary, and json.dump() writes the resulting list to a file. This makes it easy to feed CSV data into applications that expect JSON. By specifying the delimiter and the encoding, you can adapt the conversion to different kinds of CSV files.