Spatially ordering data

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Spatially ordering data

Jonathan Moules-2
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Darafei "Komяpa" Praliaskouski
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Rémi Cura
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Lars Aksel Opsahl-2

Hi


Here https://github.com/larsop/content_balanced_grid is some generic code that you can use for this. 



The sql below will return a grid that contains max 4000 touching objects pr cell. If there is low density of objects in a area, the cells for that area will be bigger than the cells in a area with high density of objects. This means that the cells will we have varying sizes in the grid. This code only counts number of objects and not the number points in each object.



SELECT q_grid.cell::geometry(geometry,4258)  as geo 
FROM (
SELECT(ST_Dump(
cbg_content_based_balanced_grid(ARRAY['org_ar5.ar5_flate geo'],4000))
).geom AS cell) AS q_grid;

Lars


Fra: postgis-users <[hidden email]> på vegne av Rémi Cura <[hidden email]>
Sendt: 1. mars 2017 13:11
Til: PostGIS Users Discussion
Emne: Re: [postgis-users] Spatially ordering data
 
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Travis Kirstine
In reply to this post by Rémi Cura
if you go with Remi suggestion you then can cluster the quad tree index to reorder records  

http://postgis.net/docs/performance_tips.html



On 1 March 2017 at 07:11, Rémi Cura <[hidden email]> wrote:
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Jonathan Moules-2
Thanks for the suggestions. Some thoughts after testing:


Darafei: (order by ST_Geohash) Interesting notion, but when implemented the grouping is tenuous at best. If I return 5,000 features they can be spread over 200km (bad) or 6km (good). It'd definitely better clustered than without this, but I was surprised at how dispersed some of the groups were, and that may be too much for my purposes.
Although I see Paul's answer to this SO question (https://gis.stackexchange.com/questions/188879/clustering-markers-using-quadtree/189296) may help. I've not tested it yet.

Lars: (content_balanced_grid) The issue here is what happens to features that overlap the grid edges, as I only want to process any given polygon once, and I can't clip them? I'd have to use a ST_Within to only get polygons inside the CBG (which is fine), but the remaining features would need an ST_Overlaps against the entire grid. I've been running that 30 mins so far on a small sample of my full database, and it's still running, so it definitely won't scale.

Remi: I was looking for something higher level. Implementing a Quad-Tree in SQL isn't something I'd be comfortable doing. Although if it is simple to do, it may be worth implementing as a feature within PostGIS; I'm sure others would also benefit from being able to easily spatially sort. I tried googling PostGIS quadtree but didn't find anything so I'm guessing PG doesn't implement this type of index by default.

Are there any other options? I may consider trying to glue together the two approaches, but that's going to take some thought.

Thanks,
Jonathan

---- On Wed, 01 Mar 2017 12:56:58 +0000 Travis Kirstine <[hidden email]> wrote ----
if you go with Remi suggestion you then can cluster the quad tree index to reorder records  

http://postgis.net/docs/performance_tips.html



On 1 March 2017 at 07:11, Rémi Cura <[hidden email]> wrote:
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users



_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Burgholzer, Robert (DEQ)
How about:
1. Take a centroid of each shape,
2. Rounding it’s x and y coordinates to whole lat/lon values
3. Ordering by lat_round, lon_round and these values would proceed in a gridded fashion.  

Ex:  select oid, floor(st_x(the_geom)) as x_index, floor(st_y(the_geom)) as y_index from precip_noaa_daily_grid order by x_index, y_index;
  oid  | x_index | y_index
-------+---------+---------
  2143 |     -84 |      36
  2142 |     -84 |      36
  2141 |     -84 |      36
  2620 |     -84 |      36
  2619 |     -84 |      36
  2618 |     -84 |      36


The only drawback is that if there were data that were added during processing, they may be skipped by your processor, however, if you were to set a flag to indicate that something was processed, you’d be covered.

Hth,
/r/b

From: postgis-users [mailto:[hidden email]] On Behalf Of Jonathan Moules
Sent: Thursday, March 02, 2017 12:42 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Spatially ordering data

Thanks for the suggestions. Some thoughts after testing:


Darafei: (order by ST_Geohash) Interesting notion, but when implemented the grouping is tenuous at best. If I return 5,000 features they can be spread over 200km (bad) or 6km (good). It'd definitely better clustered than without this, but I was surprised at how dispersed some of the groups were, and that may be too much for my purposes.
Although I see Paul's answer to this SO question (https://gis.stackexchange.com/questions/188879/clustering-markers-using-quadtree/189296) may help. I've not tested it yet.

Lars: (content_balanced_grid) The issue here is what happens to features that overlap the grid edges, as I only want to process any given polygon once, and I can't clip them? I'd have to use a ST_Within to only get polygons inside the CBG (which is fine), but the remaining features would need an ST_Overlaps against the entire grid. I've been running that 30 mins so far on a small sample of my full database, and it's still running, so it definitely won't scale.

Remi: I was looking for something higher level. Implementing a Quad-Tree in SQL isn't something I'd be comfortable doing. Although if it is simple to do, it may be worth implementing as a feature within PostGIS; I'm sure others would also benefit from being able to easily spatially sort. I tried googling PostGIS quadtree but didn't find anything so I'm guessing PG doesn't implement this type of index by default.

Are there any other options? I may consider trying to glue together the two approaches, but that's going to take some thought.

Thanks,
Jonathan

---- On Wed, 01 Mar 2017 12:56:58 +0000 Travis Kirstine <[hidden email]> wrote ----
if you go with Remi suggestion you then can cluster the quad tree index to reorder records  

http://postgis.net/docs/performance_tips.html



On 1 March 2017 at 07:11, Rémi Cura <[hidden email]> wrote:
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Tim Keitt
In reply to this post by Jonathan Moules-2



On Wed, Mar 1, 2017 at 5:28 AM, Jonathan Moules <[hidden email]> wrote:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

Can you define "spatial chunks"? Its pretty hard to answer if this is not precise.
 

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

If you want to limit the number of rows, declare a cursor (look up in postgres docs) with a maximum number of rows to return.

THK
 

Thanks,
Jonathan

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Lars Aksel Opsahl-2
In reply to this post by Jonathan Moules-2

Hi


Thanks for testing it. We have used it quite bit and have not seen any scaling problems yet.


For a layer with 8 million polygons (about 1 billion points) it takes about one minute to create a content based grid. (Se example further down)


We use the && operator to check intersection to avoid loading all data, but what sometimes take time is to compute the extent of the layer. Before I used ST_estimatedExtent (https://github.com/larsop/content_balanced_grid/commit/66c8267590543951c1b5088ff021cc5b70e0acff) but that function gives to small extent in some cases. To use the ST_estimatedExtent, we have to use add some more tests to ensure that the extent is big enough, but since we have not seen any need yet its not done either.


We usually use this kind of grids to be able to run jobs i parallel with about equal load in each job. To be sure that each polygon is only handled by one job we can use ST_Centroid(or faster first point in exterior ring) and ST_CoveredBy cell.bTo be 100% sure you also have to check for ST_Centroid(or first point) on cell borders.



SELECT count(q_grid.cell::geometry(geometry,4258)) as geo

FROM (

SELECT(ST_Dump(

cbg_content_based_balanced_grid(ARRAY['org_ar5.ar5_flate geo'],4000))

).geom AS cell) AS q_grid

geo

------

5245

(1 row)

Time: 62370.532 ms


SELECT count(*) from org_ar5.ar5_flate;

count

---------

8263246

(1 row)


SELECT PostGIS_Full_Version();

postgis_full_version

--------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------

POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.6" LIBJSON="0.11" TOP

OLOGY RASTER



Lars




Fra: postgis-users <[hidden email]> på vegne av Jonathan Moules <[hidden email]>
Sendt: 2. mars 2017 18:41
Til: PostGIS Users Discussion
Emne: Re: [postgis-users] Spatially ordering data
 
Thanks for the suggestions. Some thoughts after testing:


Darafei: (order by ST_Geohash) Interesting notion, but when implemented the grouping is tenuous at best. If I return 5,000 features they can be spread over 200km (bad) or 6km (good). It'd definitely better clustered than without this, but I was surprised at how dispersed some of the groups were, and that may be too much for my purposes.
Although I see Paul's answer to this SO question (https://gis.stackexchange.com/questions/188879/clustering-markers-using-quadtree/189296) may help. I've not tested it yet.

Lars: (content_balanced_grid) The issue here is what happens to features that overlap the grid edges, as I only want to process any given polygon once, and I can't clip them? I'd have to use a ST_Within to only get polygons inside the CBG (which is fine), but the remaining features would need an ST_Overlaps against the entire grid. I've been running that 30 mins so far on a small sample of my full database, and it's still running, so it definitely won't scale.

Remi: I was looking for something higher level. Implementing a Quad-Tree in SQL isn't something I'd be comfortable doing. Although if it is simple to do, it may be worth implementing as a feature within PostGIS; I'm sure others would also benefit from being able to easily spatially sort. I tried googling PostGIS quadtree but didn't find anything so I'm guessing PG doesn't implement this type of index by default.

Are there any other options? I may consider trying to glue together the two approaches, but that's going to take some thought.

Thanks,
Jonathan

---- On Wed, 01 Mar 2017 12:56:58 +0000 Travis Kirstine <[hidden email]> wrote ----
if you go with Remi suggestion you then can cluster the quad tree index to reorder records  

http://postgis.net/docs/performance_tips.html



On 1 March 2017 at 07:11, Rémi Cura <[hidden email]> wrote:
hey,
the most classical approach would be to construct a quad tree.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Komяpa" Praliaskouski <[hidden email]>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/ 

for simplest case, you can just order by your geom field.

ср, 1 мар. 2017 г. в 14:44, Jonathan Moules <[hidden email]>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0
   
Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users



_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Spatially ordering data

Jonathan Moules-2
In reply to this post by Burgholzer, Robert (DEQ)
Hi List,
    Thanks for all the suggestions. I think I've solved this using a combination of methods from the responses and wanted to share for anyone in the future.
   
Things I tried that didn't work for me (beyond what was in the previous post):
* Convert to a point, then extract the Lat/Lon and integer ORDERing by that (as per Robert's suggestion) - this didn't work well because it tends to produce long strips which isn't suitable for my purposes.

* Mark's linked Quadtree generation SQL. But I couldn't get it to work.

What did work:
I took the Content Based Grid from previously (which seems to basically be a form of Quadtree anyway) that I had created with Lar's tool, and then grabbed the bounds of it.

For each grid, I could then run this SQL:

select
    polygon_id,
    geom
from (
    select
        polygon_id,
        geom,
        floor(st_x(p_geom)) as x_index,
        floor(st_y(p_geom)) as y_index
    from (
        select
            polygon_id,
            geom,
            ST_PointN(ST_Exteriorring(geom), 1) as p_geom
        from
            mdms.polygon_active
    ) a
) b
where
    x_index >= @Value(_xmin) and x_index < @Value(_xmax)
    AND
    Y_index >= @Value(_ymin) and y_index < @Value(_ymax)


This was run from FME, hence the "@Value()" bits in the WHERE clause, which substitute the Grid bounds values when sent to SQL. I'm sure the entire thing could be a single SQL query, but that I leave as an exercise for the reader.

For those who, like me, don't have SQL as their first or second language:
1) The innermost select converts the polygons to points (i.e. "ST_PointN(ST_Exteriorring(geom), 1)"). I'm using this method because it's considerably faster than ST_Centroid, and I don't care where the point is in relation to the geometry.

2) The mid-select get's the point coordinates (i.e. "floor(st_x(p_geom)) as x_index")

3) The outer clause uses the WHERE query to limit the points to only include ones that are greater than or EQUAL to the min bounds and less than the max bounds. This solves the problem of what happens if a point is on a shared boundary - it will be assigned to the grid whose minimum it sits on.

This processes reads and assigns a sample set of 1.3 million points in about 4 minutes.

The result of this is lots of polygons that are all spatially clustered together, in groups of no more than 500,000 polygons (that's the value I used to create the Grid).

The only problem I'm aware of is the fact that any points that sit on the very maximum boundary of the entire grid may not be read as they will be equal to the maximum. I haven't come up with a good solution to that one other than to re-read it all.

Hope that helps someone. I think spatial sorting would be a great feature for PostGIS to incorporate by way of a formal function.

Cheers,
Jonathan


---- On Thu, 02 Mar 2017 18:38:28 +0000 Burgholzer<[hidden email]> wrote ----
How about:
1. Take a centroid of each shape,
2. Rounding it’s x and y coordinates to whole lat/lon values
3. Ordering by lat_round, lon_round and these values would proceed in a gridded fashion.

Ex: select oid, floor(st_x(the_geom)) as x_index, floor(st_y(the_geom)) as y_index from precip_noaa_daily_grid order by x_index, y_index;
oid | x_index | y_index
-------+---------+---------
2143 | -84 | 36
2142 | -84 | 36
2141 | -84 | 36
2620 | -84 | 36
2619 | -84 | 36
2618 | -84 | 36


The only drawback is that if there were data that were added during processing, they may be skipped by your processor, however, if you were to set a flag to indicate that something was processed, you’d be covered.

Hth,
/r/b


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Loading...