An increasing number of companies are collaborating with AerisWeather. In turn, we continue to deliver improvements and widen the scope of our weather data offerings. As companies grow and expand, so does the amount of data they generate. Naturally, leveraging this data effectively can be challenging.
One of the most popular methods of analysis is via CSV file. The format is commonly associated with Excel spreadsheets and is popular in data analysis. The downside of CSV files is that they are not as versatile for mass distribution as other formats. This is why AerisWeather distributes data in JSON or GeoJSON file formats, which can then be converted to CSV – but the translation process can appear daunting. In this post, we will walk through the following:
Differences between JSON, GeoJSON, and CSV
Why AerisWeather uses JSON
Two approaches to convert JSON to CSV
What are the differences between JSON, GeoJSON, and CSV?
AerisWeather also exports data in the form of GeoJSON. GeoJSON is a more focused JSON format, used for encoding various geographic data structures. GeoJSON data can be found almost anywhere mapping is required. Services like Google Maps primarily use this data format to deliver specific geospatial data for all kinds of scenarios. GeoJSON generally makes use of one or multiple coordinate points in the form of different data types, typically called Features. These range from a singular coordinate point, lines containing multiple coordinate points, or even full geospatial areas enclosed by these lines.
Comma-Separated Values, also referred to as a CSV file, is a format commonly used in numerous situations. A CSV, as the name implies, contains values separated by commas in a plain text format. This file type is typically associated with Microsoft Excel - likely because Microsoft Excel and programs like it, such as Google Sheets, are programs that are created to interpret and manipulate CSV files.
Why does AerisWeather use JSON and GeoJSON over CSV?
We deliver our data in JSON or GeoJSON instead of CSV for a few primary reasons.
JSON has no difficulty displaying hierarchical data (unlike CSV)
JSON better supports hierarchical and relational data and returns filtered results more efficiently and reliably.
JSON allows for scalability and can be integrated with programs and applications with ease. CSV is not as easily scaled, and on top of this, can be more difficult to integrate.
Perhaps most importantly, JSON works extremely well with complex, large-scale datasets (like our weather datasets!), whereas CSV tends to function best with smaller datasets.
It’s worth noting that JSON/GeoJSON can have a larger file size compared to CSV files; however, this is a small price to pay to have the opportunity to deliver complete and complex datasets to clients across the globe.
Although we provide our data in JSON and GeoJSON format, we understand not all of our clients use JSON-compatible programs. The following examples will show you how to easily translate our data to a more universal CSV format.
To follow along with these examples, you will need the following:
An AerisWeather API subscription or free trial
Working knowledge of Microsoft Excel and/or Python 3.~
The following software installed:
This first method will make use of Python and pandas, a powerful Python library often used for data analysis. This method is straightforward and only takes a few steps:
Check that you’ve installed Python successfully with the following command:
If this is successful, you should see an output of something like Python 3.x.x
Open your terminal and navigate to your project folder. Make sure that the JSON file you have saved is here, and begin your Python instance using this command:
Repeat this process for pandas by importing the library using this command:
import pandas as pd
Check the current version of pandas with this command:
NOTE: This command contains 2 underscores, followed by “version”, finishing with 2 underscores.
Now that we have verified our Python and pandas version, begin the conversion process, changing the project name to one suitable for your project:
result = pd.read_json(‘json_name.json’, orient=’records’, typ=’frame’)
This code reads your JSON file with name ‘json_name.json’ and assigns it to a variable result.
Next, create the field names you want to convert, and insert them into an array in this format:
fields = [‘field1’, field2.subfield1, field2.subfield2,...]
The variables in the array above are generic field names. It is highly recommended to examine your JSON file and interpret which data you want to convert for analysis.
After this command, normalize your results with pandas like so:
df = pd.json.normalize(results[‘superfield’])
This variable is assigned to a pandas function reads our previous results variable, and searches inside it for ‘superfield’. Then it prepares the JSON file for conversion. Keep in mind, the value ‘superfield’ can be changed to whatever parent value you are searching for within the results variable.
Now we create a variable output, gearing our normalization function with our desired fields:
output = df[fields]
Finally, we complete the conversion process with this command:
Conversion to CSV can also be completed through Microsoft Excel. This process can be slightly more complicated but offers the ability to see your data during the conversion. For this example, we will be investigating JSON results from the AerisWeather conditions endpoint.
Open Excel and create a new Spreadsheet
Navigate to the Data tab
Select Get Data > From File > From JSON
Select your desired JSON file to import
If you want this data sheet to have a specific name, you can change this in the field on the right-hand side. Otherwise, this will default to the JSON file’s name
Click on the “List” next to “periods”
Click on “Convert to Table” in the upper-left corner, then click “Okay” in the pop-up window
Click on the double-arrow icon in the top right corner of the header cell.
Uncheck “Use original column name as prefix” unless it is desired, then click “Okay”.
Repeat this process for all headers of columns of “Record” cells. We recommend leaving “Use original column name as prefix” checked for these entries!
If only certain columns are desired, right-click the column header and click “Remove” or “Remove others” to remove all columns not selected.
Click “Close and Load” in the top-right corner.
This blog reviewed the differences between JSON, GeoJSON, and CSV file formats, the reasons why AerisWeather chooses to deliver its data in JSON format, and how to convert our JSON/GeoJSON data to a CSV format. Use this information to cut down on time and analyze large datasets more efficiently – and as always, if you have any questions, reach out - we’re here to help!