How to query nested fields in MongoDB using Presto

I'm setting up a Presto cluster which I'd like to use to query a MongoDB instance. Data in my Mongo instance has the following structure:
_id: <value>
somefield: <value>
otherfield: <value>
nesting_1: {
    nested_field_1_1: <value>
    nested_field_1_2: <value>
}
nesting_2: {
    nesting_2_1: {
        nested_field_2_1_1: <value>
        nested_field_2_1_2: <value>
    }
    nesting_2_2: {
        nested_field_2_2_1: <value>
        nested_field_2_2_2: <value>
    }
}
Just by plugging it in, Presto correctly identifies and creates columns for the values in the top level (e.g. somefield, otherfield) and in the first nesting level -- that is, it creates a column for nesting_1, and its content is a row(nested_field_1_1 <type>, nested_field_1_2 <type>, ...), and I can query table.nesting_1.nested_field_1_1.
However, fields with an extra nesting layer (e.g. nesting_2 and everything within it) are missing from the table schema. Presto's documentation for the MongoDB connector does mention that:
At startup, this connector tries guessing fields’ types, but it might not be correct for your collection. In that case, you need to modify it manually. CREATE TABLE and CREATE TABLE AS SELECT will create an entry for you.
While that seems to explain my use case, it's not very clear on how to "modify it manually" -- a CREATE TABLE statement doesn't seem appropriate, as the table is already there. The documentation also has a section on how to declare fields and their types, but it's also not very clear on how to deal with multiple nesting levels.
My question is: how do I set up Presto's MongoDB connector so that I can query fields in the third nesting layer?
Answers can assume that:
all nested fields' names are known;
there are only 3 layers;
there is no need to preserve the layered table layout (i.e. I don't mind if my resulting Presto table has all nested fields as unique columns like somefield, rather than one field with rows like nesting_1 in the above example);
extra points if the solution doesn't require me to explicitly declare the names and types of all columns in the third layer, as I have over 1500 of them -- but this is not a hard requirement.

In Presto, the property mongodb.schema-collection names the collection that describes the schema of your MongoDB collections. As described in the documentation, this property is optional and the default is _schema.
it's not very clear on how to "modify it manually" -- a CREATE TABLE statement doesn't seem appropriate, as the table is already there.
The schema collection is supposed to be created and populated automatically, but what I've noticed is that it is not populated until some queries are executed, and it only generates the schema for the collections that are queried.
However, there is an open bug: some fields/columns are not automatically picked up.
Also, once an entry for a collection is created/populated, it won't be updated automatically. Any update needs to be done manually (if the collection starts to have new fields, they won't be detected automatically).
To manually update the schema, add each column as another entry in the fields array. As mentioned in the docs, each entry has three parts:
name: Name of the column in the Presto table. It needs to match the name of the collection field.
type: Presto type of the column. The ROW type can be used for nested properties.
hidden: Hides the column from DESCRIBE <table name> and SELECT *. Defaults to false.
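Since an existing entry is not refreshed automatically, a small check can show which top-level fields of a document are not declared yet. The following is only a sketch in Python: the function name missing_columns and the sample data are hypothetical, and it assumes the schema entry has the "fields" array layout described above.

```python
def missing_columns(schema_entry, document):
    """Return top-level keys of `document` that have no entry in schema_entry["fields"]."""
    declared = {field["name"] for field in schema_entry["fields"]}
    return [key for key in document if key not in declared]


# Hypothetical schema entry that was generated before nesting_2 was picked up.
schema_entry = {
    "table": "your_collection",
    "fields": [
        {"name": "_id", "type": "ObjectId", "hidden": True},
        {"name": "somefield", "type": "varchar", "hidden": False},
    ],
}

# Hypothetical document from the same collection.
document = {"_id": 1, "somefield": "x", "nesting_2": {"nesting_2_1": {}}}

print(missing_columns(schema_entry, document))
```

Each name it reports would need its own entry added to the fields array by hand.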
My question is: how do I setup Presto's MongoDB connector so that I can query fields in the third nesting layer?
The schema definition for a MongoDB collection like the one you posted will contain something like:
"fields": [
    {
        "name": "_id",
        "type": "ObjectId",
        "hidden": true
    },
    {
        "name": "somefield",
        "type": "varchar",
        "hidden": false
    },
    {
        "name": "otherfield",
        "type": "varchar",
        "hidden": false
    },
    {
        "name": "nesting_1",
        "type": "row(nested_field_1_1 varchar, nested_field_1_2 bigint)",
        "hidden": false
    },
    {
        "name": "nesting_2",
        "type": "row(nesting_2_1 row(nested_field_2_1_1 varchar, nested_field_2_1_2 varchar), nesting_2_2 row(nested_field_2_2_1 varchar, nested_field_2_2_2 varchar))",
        "hidden": false
    }
]
Nested fields can then be queried using dot notation on the column, like:
SELECT nesting_2.nesting_2_1.nested_field_2_1_1 FROM table;
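With over 1500 nested fields, writing the row(...) type strings by hand is error-prone, so they could be generated from one representative document instead. The sketch below is not part of the connector: the helper names infer_spec and row_type are hypothetical, and the mapping from Python values to Presto types (boolean, bigint, double, varchar as a fallback) is an assumption that should be checked against the real data.

```python
def infer_spec(value):
    """Guess a type spec from a sample value (assumed mapping, adjust as needed)."""
    if isinstance(value, dict):
        return {name: infer_spec(sub) for name, sub in value.items()}
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "varchar"  # assumed fallback for strings and anything else


def row_type(spec):
    """Render a spec as a Presto type string, using row(...) for nested dicts."""
    if isinstance(spec, str):
        return spec
    parts = ", ".join(f"{name} {row_type(sub)}" for name, sub in spec.items())
    return f"row({parts})"


# Hypothetical sample value of the nesting_2 field from one document.
sample_nesting_2 = {
    "nesting_2_1": {"nested_field_2_1_1": "a", "nested_field_2_1_2": "b"},
    "nesting_2_2": {"nested_field_2_2_1": "c", "nested_field_2_2_2": 7},
}

print(row_type(infer_spec(sample_nesting_2)))
```

The printed string can be pasted into the "type" of the nesting_2 entry. A sample document only shows the fields it actually contains, and an empty sub-document would render as row(), so the result is a starting point to review rather than a finished schema.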

If the Mongo collection being queried does not have a fixed schema recorded in the _schema collection, Presto is not able to infer the document structure.
Alternatively, you can explicitly declare the schema through the connector configuration, using the property mongodb.schema-collection, as described in the documentation. You can set it to a different Mongo collection that stores the same kind of entries, and create that collection yourself.
Nested fields can be declared using the ROW data type, which is also described in the docs and behaves like what would be a struct or dictionary in other programming languages.

You can create a collection in MongoDB, for example "presto_schema", in your database and insert a sample schema entry like this:
{
    "table" : "your_collection",
    "fields" : [
        {
            "name" : "_id",
            "type" : "ObjectId",
            "hidden" : true
        },
        {
            "name" : "last_name",
            "type" : "varchar",
            "hidden" : false
        },
        {
            "name" : "id",
            "type" : "varchar",
            "hidden" : false
        }
    ]
}
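In the Presto catalog properties file for your MongoDB connector, add the property like this:
mongodb.schema-collection=presto_schema
From now on, Presto will use "presto_schema" instead of the default "_schema" collection to look up table schemas.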
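If you would rather script the insert than type the entry by hand, something along these lines might work. It is a sketch under assumptions: build_schema_entry and upsert_schema_entry are hypothetical helper names, the connection URI and database name are placeholders you would supply, and the upsert relies on the third-party pymongo driver being installed.

```python
def build_schema_entry(table, columns, hidden=("_id",)):
    """Build a schema entry from a {column name: Presto type string} mapping."""
    fields = [
        {"name": name, "type": type_, "hidden": name in hidden}
        for name, type_ in columns.items()
    ]
    return {"table": table, "fields": fields}


def upsert_schema_entry(uri, database, schema_collection, entry):
    """Insert or replace the entry for entry["table"] (needs pymongo installed)."""
    from pymongo import MongoClient  # third-party driver, imported only when used

    with MongoClient(uri) as client:
        client[database][schema_collection].replace_one(
            {"table": entry["table"]}, entry, upsert=True
        )


entry = build_schema_entry(
    "your_collection",
    {"_id": "ObjectId", "last_name": "varchar", "id": "varchar"},
)
print(entry)

# Placeholder URI and database name, shown only as an example call:
# upsert_schema_entry("mongodb://localhost:27017", "your_database", "presto_schema", entry)
```

Matching on "table" and replacing the whole entry keeps a single entry per collection, which fits the earlier observation that entries have to be maintained manually.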


