Working with CSV data enclosed in quotes

CSV files occasionally have quotes around the data values in each column, and those quotes are not part of the data to be analyzed. To read such a CSV file properly, we can update the table properties in AWS Glue to use the OpenCSVSerDe.
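As a rough local analogy (not the SerDe itself), the sketch below uses Python's standard csv module to show the difference between splitting a line on commas and parsing it with an explicit separator, quote character, and escape character. The sample rows are made up for illustration and are not taken from the workshop dataset.

```python
import csv
import io

# Made-up sample rows: every value is wrapped in double quotes,
# and one value contains a comma inside the quotes.
raw = (
    '"id","borough","zone"\n'
    '"1","EWR","Newark Airport"\n'
    '"2","Queens","Jamaica Bay, East"\n'
)

# Naive parsing: split each line on commas. The quotes stay in the
# values, and the quoted comma wrongly splits one field into two.
naive = [line.split(",") for line in raw.splitlines()]

# Quote-aware parsing: separator, quote, and escape characters are
# stated explicitly, mirroring the three SerDe parameters by analogy.
parsed = list(
    csv.reader(io.StringIO(raw), delimiter=",", quotechar='"', escapechar="\\")
)

for row in parsed:
    print(row)
```

With the quote-aware reader, the surrounding quotes are dropped and a comma inside a quoted value stays part of that value, which is the behavior the table should show once it uses a quote-aware SerDe.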

  1. Go to the AWS Glue console.

  2. In the left navigation menu, click Tables under the Data Catalog section.

  3. On the Tables screen, click the raw_taxi_zone_lookup table.

  4. Click Actions, then click on Edit table.

  5. Set Serialization lib to org.apache.hadoop.hive.serde2.OpenCSVSerde.

  6. Remove existing Serde parameters and then add the following:

    • escapeChar, enter a backslash \
    • quoteChar, enter a double quote "
    • separatorChar, enter a comma ,

  7. Click Save.

  8. Go back to the Athena console.

  9. On the Query editor page, click the actions menu icon beside raw_taxi_zone_lookup, and then click Preview table.

  10. Observe that the string values are no longer enclosed in double quotes.
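
The same table change could also be scripted instead of made in the console. The following is a minimal sketch, assuming boto3 and its Glue client methods get_table and update_table. The helper names with_open_csv_serde and update_table_serde are hypothetical, and the list of keys copied into TableInput is an assumed subset that should be checked against the current Glue API reference.

```python
import copy

OPEN_CSV_SERDE = "org.apache.hadoop.hive.serde2.OpenCSVSerde"

# Keys of a get_table response that update_table accepts in TableInput.
# Assumed subset: read-only fields such as CreateTime must be left out.
TABLE_INPUT_KEYS = (
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
)


def with_open_csv_serde(table):
    """Return a TableInput dict that uses OpenCSVSerde (hypothetical helper)."""
    table_input = {
        key: copy.deepcopy(table[key]) for key in TABLE_INPUT_KEYS if key in table
    }
    storage = table_input.setdefault("StorageDescriptor", {})
    serde = storage.setdefault("SerdeInfo", {})
    serde["SerializationLibrary"] = OPEN_CSV_SERDE
    # Replace any existing SerDe parameters, as in step 6.
    serde["Parameters"] = {
        "escapeChar": "\\",
        "quoteChar": '"',
        "separatorChar": ",",
    }
    return table_input


def update_table_serde(glue_client, database, table_name):
    """Fetch the table, swap in the SerDe settings, and save it."""
    table = glue_client.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue_client.update_table(
        DatabaseName=database, TableInput=with_open_csv_serde(table)
    )


# Usage (requires boto3 and AWS credentials; the database name is a placeholder):
#   import boto3
#   update_table_serde(boto3.client("glue"), "my_database", "raw_taxi_zone_lookup")
```

Building the TableInput in a separate function keeps the change easy to inspect before anything is sent to AWS.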