AWS Certified Data Analytics – Specialty DAS-C01 – Question 054

A company stores Apache Parquet-formatted files in Amazon S3. The company uses an AWS Glue Data Catalog to store the table metadata and Amazon Athena to query and analyze the data. The tables have a large number of partitions. The queries are only run on small subsets of data in the table. A data analyst adds new time partitions into the table as new data arrives. The data analyst has been asked to reduce the query runtime.
Which solution will provide the MOST reduction in the query runtime?

A. Convert the Parquet files to the .csv file format. Then attempt to query the data again.
B. Convert the Parquet files to the Apache ORC file format. Then attempt to query the data again.
C. Use partition projection to speed up the processing of the partitioned table.
D. Add more partitions to be used over the table. Then filter over two partitions and put all columns in the WHERE clause.

Correct Answer: C
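
Partition projection (option C) lets Athena compute partition values and locations from table properties instead of looking up each partition in the AWS Glue Data Catalog. That lookup is typically the bottleneck for highly partitioned tables, and projection also means new time partitions no longer have to be added by hand. The following is a minimal sketch that only builds the DDL text. The table name `events`, the partition column `dt`, the bucket `example-bucket`, and the start date are hypothetical and do not come from the question.

```python
def projection_ddl(table, column, start_date, location_template):
    """Build an ALTER TABLE statement that enables Athena partition
    projection for a single date-typed partition column."""
    props = {
        "projection.enabled": "true",
        f"projection.{column}.type": "date",
        f"projection.{column}.format": "yyyy-MM-dd",
        # NOW keeps the upper bound moving as new daily data arrives.
        f"projection.{column}.range": f"{start_date},NOW",
        f"projection.{column}.interval": "1",
        f"projection.{column}.interval.unit": "DAYS",
        # Tells Athena where each projected partition lives in S3.
        "storage.location.template": location_template,
    }
    body = ",\n".join(f"  '{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n{body}\n);"


# Hypothetical names, for illustration only.
ddl = projection_ddl(
    "events",
    "dt",
    "2020-01-01",
    "s3://example-bucket/events/dt=${dt}/",
)
print(ddl)
```

The generated statement could be run in the Athena query editor. Queries that filter on `dt` would then be pruned to the matching S3 prefixes without a catalog lookup for each partition.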