points DWithin other points

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

points DWithin other points

juli g. pausas
Hi list,
I have a table with about 200,000 points (table: WDplots), and another larger table with also point (in fact, fires; table: n3_cmg_terra). I'd like to know how many of the points (fires) in the n3_cmg_terra are close (e.g., 0.5 km radius) to my plots (WDplots). So I compute nfires (and also avgfrp), as following:

UPDATE WDplots AS t1 SET nfires= tt.nfires,
                                           avgfrp= tt.avgfrp
    FROM (
             SELECT sa.gid AS gid, count(*) AS nfires, avg(fi.frp) as avgfrp
             FROM WDplots AS sa,
                  n3_cmg_terra AS fi
               WHERE ST_DWithin(geography(fi.geom), geography(sa.geom), 500)
               AND (ST_Buffer(geography(fi.geom), 500) && sa.geom)
               AND fi.dist < 100
               AND confidence >30
               AND EXTRACT(YEAR FROM fi.date) BETWEEN 2002 AND 2015
               GROUP BY sa.gid ORDER BY sa.gid
    ) AS tt
    WHERE (t1.gid = tt.gid);


But this has been run for 26 days and has not finished yet (I've stopped!). My computer is not that bad (see below), so I'm sure I'm doing something wrong (very wrong probable). Any help?

The two tables have indices:

CREATE INDEX wdplots_geom_gist
  ON wdplots
  USING gist (geom);

CREATE INDEX wdplots_gid_index
  ON wdplots
  USING btree   (gid);

 CREATE INDEX n3_cmg_terra_geom_gist
  ON n3_cmg_terra
  USING gist  (geom);

Thanks for any comment.

Juli

Computer: Intel® Xeon(R) CPU E5-2620 v4 @ 2.10GHz × 32, 125,8 GiB, on Ubuntu 16.04

SELECT version();
"PostgreSQL 9.5.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit"

SELECT PostGIS_version();
"2.2 USE_GEOS=1 USE_PROJ=1 USE_STATS=1"


--
CIDE, CSIC  |  www.uv.es/jgpausas  |  blog


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: points DWithin other points

Darafei "Komяpa" Praliaskouski
Please share your query plan using EXPLAIN.

https://explain.depesz.com/ 

вт, 25 апр. 2017 г. в 14:17, juli g. pausas <[hidden email]>:
Hi list,
I have a table with about 200,000 points (table: WDplots), and another larger table with also point (in fact, fires; table: n3_cmg_terra). I'd like to know how many of the points (fires) in the n3_cmg_terra are close (e.g., 0.5 km radius) to my plots (WDplots). So I compute nfires (and also avgfrp), as following:

UPDATE WDplots AS t1 SET nfires= tt.nfires,
                                           avgfrp= tt.avgfrp
    FROM (
             SELECT sa.gid AS gid, count(*) AS nfires, avg(fi.frp) as avgfrp
             FROM WDplots AS sa,
                  n3_cmg_terra AS fi
               WHERE ST_DWithin(geography(fi.geom), geography(sa.geom), 500)
               AND (ST_Buffer(geography(fi.geom), 500) && sa.geom)
               AND fi.dist < 100
               AND confidence >30
               AND EXTRACT(YEAR FROM fi.date) BETWEEN 2002 AND 2015
               GROUP BY sa.gid ORDER BY sa.gid
    ) AS tt
    WHERE (t1.gid = tt.gid);


But this has been run for 26 days and has not finished yet (I've stopped!). My computer is not that bad (see below), so I'm sure I'm doing something wrong (very wrong probable). Any help?

The two tables have indices:

CREATE INDEX wdplots_geom_gist
  ON wdplots
  USING gist (geom);

CREATE INDEX wdplots_gid_index
  ON wdplots
  USING btree   (gid);

 CREATE INDEX n3_cmg_terra_geom_gist
  ON n3_cmg_terra
  USING gist  (geom);

Thanks for any comment.

Juli

Computer: Intel® Xeon(R) CPU E5-2620 v4 @ 2.10GHz × 32, 125,8 GiB, on Ubuntu 16.04

SELECT version();
"PostgreSQL 9.5.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit"

SELECT PostGIS_version();
"2.2 USE_GEOS=1 USE_PROJ=1 USE_STATS=1"


--
CIDE, CSIC  |  www.uv.es/jgpausas  |  blog

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: points DWithin other points

Hugues François
In reply to this post by juli g. pausas
Hello,

If I'm not wrong, I think the on the fly geography casting can be costly and you may loose the benefit of your index for st_dwithin function. You may try to add a geography column to store your data in geography format and then add a spatial index. 

I also think that the bbox condition is redundant with st_dwithin and st_buffer is quite long to run. 

Regards,

Hug


Le 25 avr. 2017 1:17 PM, "juli g. pausas" <[hidden email]> a écrit :
Hi list,
I have a table with about 200,000 points (table: WDplots), and another larger table with also point (in fact, fires; table: n3_cmg_terra). I'd like to know how many of the points (fires) in the n3_cmg_terra are close (e.g., 0.5 km radius) to my plots (WDplots). So I compute nfires (and also avgfrp), as following:

UPDATE WDplots AS t1 SET nfires= tt.nfires,
                                           avgfrp= tt.avgfrp
    FROM (
             SELECT sa.gid AS gid, count(*) AS nfires, avg(fi.frp) as avgfrp
             FROM WDplots AS sa,
                  n3_cmg_terra AS fi
               WHERE ST_DWithin(geography(fi.geom), geography(sa.geom), 500)
               AND (ST_Buffer(geography(fi.geom), 500) && sa.geom)
               AND fi.dist < 100
               AND confidence >30
               AND EXTRACT(YEAR FROM fi.date) BETWEEN 2002 AND 2015
               GROUP BY sa.gid ORDER BY sa.gid
    ) AS tt
    WHERE (t1.gid = tt.gid);


But this has been run for 26 days and has not finished yet (I've stopped!). My computer is not that bad (see below), so I'm sure I'm doing something wrong (very wrong probable). Any help?

The two tables have indices:

CREATE INDEX wdplots_geom_gist
  ON wdplots
  USING gist (geom);

CREATE INDEX wdplots_gid_index
  ON wdplots
  USING btree   (gid);

 CREATE INDEX n3_cmg_terra_geom_gist
  ON n3_cmg_terra
  USING gist  (geom);

Thanks for any comment.

Juli

Computer: Intel® Xeon(R) CPU E5-2620 v4 @ 2.10GHz × 32, 125,8 GiB, on Ubuntu 16.04

SELECT version();
"PostgreSQL 9.5.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit"

SELECT PostGIS_version();
"2.2 USE_GEOS=1 USE_PROJ=1 USE_STATS=1"


--
CIDE, CSIC  |  www.uv.es/jgpausas  |  blog



_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: points DWithin other points

juli g. pausas
Many thanks! these corrections has been very helpful.

ST_Buffer && geom was very slow, and apparently not needed.
An using a new column for the geography (and adding and Index to that col) increased the speed a lot.

With these 2 changes, the query need only 7 minutes to run the 200 000 points.
Thanks!



Juli
--
CIDE, CSIC  |  www.uv.es/jgpausas  |  blog


On Tue, Apr 25, 2017 at 2:15 PM, Hugues François <[hidden email]> wrote:
Hello,

If I'm not wrong, I think the on the fly geography casting can be costly and you may loose the benefit of your index for st_dwithin function. You may try to add a geography column to store your data in geography format and then add a spatial index. 

I also think that the bbox condition is redundant with st_dwithin and st_buffer is quite long to run. 

Regards,

Hug


Le 25 avr. 2017 1:17 PM, "juli g. pausas" <[hidden email]> a écrit :
Hi list,
I have a table with about 200,000 points (table: WDplots), and another larger table with also point (in fact, fires; table: n3_cmg_terra). I'd like to know how many of the points (fires) in the n3_cmg_terra are close (e.g., 0.5 km radius) to my plots (WDplots). So I compute nfires (and also avgfrp), as following:

UPDATE WDplots AS t1 SET nfires= tt.nfires,
                                           avgfrp= tt.avgfrp
    FROM (
             SELECT sa.gid AS gid, count(*) AS nfires, avg(fi.frp) as avgfrp
             FROM WDplots AS sa,
                  n3_cmg_terra AS fi
               WHERE ST_DWithin(geography(fi.geom), geography(sa.geom), 500)
               AND (ST_Buffer(geography(fi.geom), 500) && sa.geom)
               AND fi.dist < 100
               AND confidence >30
               AND EXTRACT(YEAR FROM fi.date) BETWEEN 2002 AND 2015
               GROUP BY sa.gid ORDER BY sa.gid
    ) AS tt
    WHERE (t1.gid = tt.gid);


But this has been run for 26 days and has not finished yet (I've stopped!). My computer is not that bad (see below), so I'm sure I'm doing something wrong (very wrong probable). Any help?

The two tables have indices:

CREATE INDEX wdplots_geom_gist
  ON wdplots
  USING gist (geom);

CREATE INDEX wdplots_gid_index
  ON wdplots
  USING btree   (gid);

 CREATE INDEX n3_cmg_terra_geom_gist
  ON n3_cmg_terra
  USING gist  (geom);

Thanks for any comment.

Juli

Computer: Intel® Xeon(R) CPU E5-2620 v4 @ 2.10GHz × 32, 125,8 GiB, on Ubuntu 16.04

SELECT version();
"PostgreSQL 9.5.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit"

SELECT PostGIS_version();
"2.2 USE_GEOS=1 USE_PROJ=1 USE_STATS=1"


--
CIDE, CSIC  |  www.uv.es/jgpausas  |  blog



_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/postgis-users