Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

axis mapping managment #474

Open
aborruso opened this issue Jan 5, 2025 · 4 comments
Open

axis mapping managment #474

aborruso opened this issue Jan 5, 2025 · 4 comments

Comments

@aborruso
Copy link
Contributor

aborruso commented Jan 5, 2025

Hi,
I have this sample input shapefile: test.shp.zip.

It's in EPSG:4326, whit this extent Extent: (16.555051, 40.818774) - (16.559098, 40.827021).

I need to convert it in EPSG:32633 and I run

duckdb -c "SELECT
    punti.id AS punto_id,
    ST_Transform(punti.geom, 'EPSG:4326', 'EPSG:32633') AS geom_transformed
FROM
    st_read('test.shp') punti
"

If I do it, I have these completely wrong coordinates:

┌──────────┬────────────────────────────────────────────────────┐
│ punto_id │                  geom_transformed                  │
│  int64   │                      geometry                      │
├──────────┼────────────────────────────────────────────────────┤
│        1 │ MULTIPOINT (3336062.6618564087 2022354.8226907582) │
│        2 │ MULTIPOINT (3336032.8392719002 2022378.723668074)  │
│        3 │ MULTIPOINT (3337050.297789454 2022021.549135726)   │
│        4 │ MULTIPOINT (3337052.164717525 2022022.8591812393)  │
│        5 │ MULTIPOINT (3337053.278077624 2022024.0651664643)  │
│        6 │ MULTIPOINT (3337055.4912509737 2022024.7215322666) │
│        7 │ MULTIPOINT (3337056.163306556 2022027.2695657292)  │
│        8 │ MULTIPOINT (3337058.8584177634 2022031.8507012462) │
│        9 │ MULTIPOINT (3337056.3736178074 2022035.0150414458) │
│       10 │ MULTIPOINT (3337054.791865914 2022031.6398922037)  │
├──────────┴────────────────────────────────────────────────────┤
│ 10 rows                                             2 columns │
└───────────────────────────────────────────────────────────────┘

I couldn't understand why, and then I thought there might be a problem with axis management. In this input file I have Data axis to CRS axis mapping: 2,1.

If I add always_xy := true in my ST_TRANSFORM function I have the right coordinates

duckdb -c "SELECT
    punti.id AS punto_id,
    ST_Transform(punti.geom, 'EPSG:4326', 'EPSG:32633',always_xy := true) AS geom_transformed
FROM
    st_read('test.shp') punti
"

┌──────────┬──────────────────────────────────────────────────┐
│ punto_id │                 geom_transformed                 │
│  int64   │                     geometry                     │
├──────────┼──────────────────────────────────────────────────┤
│        1 │ MULTIPOINT (631463.7475149037 4519833.59782756)  │
│        2 │ MULTIPOINT (631483.4054201827 4519809.11361988)  │
│        3 │ MULTIPOINT (631126.0160750012 4520709.224739576) │
│        4 │ MULTIPOINT (631126.703585717 4520711.154226093)  │
│        5 │ MULTIPOINT (631127.4038529369 4520712.364957238) │
│        6 │ MULTIPOINT (631127.6060917454 4520714.525490591) │
│        7 │ MULTIPOINT (631129.2726484938 4520715.513711404) │
│        8 │ MULTIPOINT (631132.1034797084 4520718.679540156) │
│        9 │ MULTIPOINT (631134.5425931576 4520716.805550539) │
│       10 │ MULTIPOINT (631132.4120283647 4520714.8504513)   │
├──────────┴──────────────────────────────────────────────────┤
│ 10 rows                                           2 columns │
└─────────────────────────────────────────────────────────────┘

But wouldn't it be more correct for duckdb to handle the correct mapping of x and y by default on its own, without having to specify it?

Thank you

@kylebarron
Copy link

The EPSG:4326 definition defines axis order as latitude, longitude. So the "correct" way of transforming that uses the prescribed axis order, which is your first example. But in practice, many datasets store coordinates in longitude-latitude, x-y order anyways, so you need to pass always_xy=True to override the CRS definition.

@aborruso
Copy link
Contributor Author

aborruso commented Jan 5, 2025

The EPSG:4326 definition defines axis order as latitude, longitude. So the "correct" way of transforming that uses the prescribed axis order, which is your first example. But in practice, many datasets store coordinates in longitude-latitude, x-y order anyways, so you need to pass always_xy=True to override the CRS definition.

Hi @kylebarron and thank you. Please help me understand a little better.

In my sample file, if I run ogrinfo I have "Data axis to CRS axis mapping: 2,1", and I read multipoint in lon, lat (as this one MULTIPOINT ((16.5551536094489 40.8270039463864))).

The core issue is that the data appears to be presented with longitude first and latitude second, within the MULTIPOINT geometry, while the specified Coordinate Reference System (CRS), EPSG:4326, expects the opposite order: latitude first, then longitude.

  • Data axis to CRS axis mapping: 2,1: This line should confirm the issue. It means that the 2nd axis of the data (which is longitude in this case) maps to the first axis of the CRS (which is latitude) and vice-versa.
  • EPSG:4326: This is the standard WGS 84 geographic coordinate system. It defines a convention for geographic coordinates where the latitude is the first coordinate and the longitude is the second coordinate.
  • MULTIPOINT ((16.5551536094489 40.8270039463864)): The coordinates provided here, 16.55515... and 40.82700..., are inconsistent with the lat/lon order of EPSG:4326. In this context, 16.555... likely represents the longitude, and 40.827... represents the latitude.

If this analysis of mine is correct, could it not be done automatically so that whenever duckdb reads (via gdal) that Data axis to CRS axis mapping: 2,1, and that the EPSG=4326, always_xy := true is applied?

Thank you

@rouault
Copy link

rouault commented Jan 8, 2025

If this analysis of mine is correct,

I confirm it is correct. OGRSpatialReference::GetDataAxisToSRSAxisMapping() can be used to establish the correspondance between the coordinate order in OGRGeometry and the axis order of the official CRS definition.

@aborruso
Copy link
Contributor Author

aborruso commented Jan 8, 2025

I confirm it is correct. OGRSpatialReference::GetDataAxisToSRSAxisMapping() can be used to establish the correspondance between the coordinate order in OGRGeometry and the axis order of the official CRS definition.

Okay, so I think it would be very important to implement its use here in duckdb spatial. Do I open a feature request?

Thank you all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants