Improve 3D BB query on large set of lines

Improve 3D BB query on large set of lines

Tom Kazimiers
Hi all,

I am using Postgres 9.6 with PostGIS 2.3 and want to improve the
performance of a bounding box intersection query on a large set of
lines. I asked the same question two months ago on
gis.stackexchange.com, but didn't get any reply. Please excuse the
cross-post, but I figured this list would be a good place to ask. This
is the original post:

  https://gis.stackexchange.com/questions/215762

Problem: We store about 6 million lines/edges in 3D space and use an
axis-aligned bounding box to find all intersecting (and contained) lines.
This works well, but we want to improve the query time for larger point
sets.

Context: Tree-like 3D structures represent neurons; each node has a
parent node or is the root. At the moment we deal with about 15 million
nodes, grouped into 150,000 trees (many with > 10,000 nodes). I want to
improve existing performance bottlenecks with bigger result sets, and I
plan to scale this setup to 10-100x the number of nodes. Typical query
bounding boxes are slices rather than boxes, i.e. they span the full
range in XY but only a short distance in Z, although in principle any
dimension could be the "flat" one.

Setup: The table storing edges looks like this:

  => \d+ treenode_edge
                   Table "public.treenode_edge"
       Column   |          Type         | Attributes
  ------------+-----------------------+-----------
   id         | bigint                | not null
   project_id | integer               | not null
   edge       | geometry(LineStringZ) |
  Indexes:
          "treenode_edge_pkey" PRIMARY KEY, btree (id)
          "treenode_edge_gix" gist (edge gist_geometry_ops_nd)
          "treenode_edge_project_id_index" btree (project_id)

Note that there is a 3D index for the edge column in place (treenode_edge_gix).

Current timing: To request 2000 nodes with a typical bounding box size,
I use the following query:

  SELECT te.id
  FROM treenode_edge te
  -- &&& box corners: left bottom (z2), right top (z1)
  WHERE te.edge &&& 'LINESTRINGZ( -537284.0  699984.0 84770.0,
                                  1456444.0 -128432.0 84735.0)'
  -- polygon corners: left top, right top, right bottom, left bottom,
  -- left top (all at half Z); 17.5 is the half-Z distance
  AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_GeomFromText(
          'LINESTRING( -537284.0 -128432.0 84752.5, 1456444.0 -128432.0 84752.5,
                       1456444.0  699984.0 84752.5, -537284.0  699984.0 84752.5,
                       -537284.0 -128432.0 84752.5)')), 17.5)
  AND te.project_id = 1
  LIMIT 2000;

This takes about 900 ms; statistics are up to date. This is not so bad,
but I am looking for strategies to make it much faster still. The &&&
operator already filters out most of the existing edges by bounding box
test (using the index), so that ST_3DDWithin only needs to check the
edges that are most likely part of the result.

The query plan looks like this:

  Limit  (cost=48.26..4311.24 rows=70 width=8) (actual time=856.261..864.208 rows=2000 loops=1)
         Buffers: shared hit=20470
         ->  Bitmap Heap Scan on treenode_edge te  (cost=48.26..4311.24 rows=70 width=8) (actual time=856.257..863.974 rows=2000 loops=1)
                   Recheck Cond: (edge &&& '01020000800200000000000000886520C100000000A05C25410000000020B2F440000000003C39364100000000005BFFC000000000F0AFF440'::geometry)
                   Filter: ((edge && '0103000080010000000500000000000000AB6520C100000000185CFFC000000000F0AFF44000000000AB6520C100000000C35C254100000000F0AFF440000000804D39364100000000C35C25410000000020B2F440000000804D39364100000000185CFFC00000000020B2F44000000000AB6520C100000000185CFFC000000000F0AFF440'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000886520C100000000005BFFC00000000008B1F440000000003C39364100000000005BFFC00000000008B1F440000000003C39364100000000A05C25410000000008B1F44000000000886520C100000000A05C25410000000008B1F44000000000886520C100000000005BFFC00000000008B1F440'::geometry && st_expand(edge, '17.5'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000886520C100000000005BFFC00000000008B1F440000000003C39364100000000005BFFC00000000008B1F440000000003C39364100000000A05C25410000000008B1F44000000000886520C100000000A05C25410000000008B1F44000000000886520C100000000005BFFC00000000008B1F440'::geometry, '17.5'::double precision))
                   Heap Blocks: exact=1816
                   Buffers: shared hit=20470
                   ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..48.25 rows=1044 width=0) (actual time=855.795..855.795 rows=2856 loops=1)
                                 Index Cond: (edge &&& '01020000800200000000000000886520C100000000A05C25410000000020B2F440000000003C39364100000000005BFFC000000000F0AFF440'::geometry)
                                 Buffers: shared hit=18654
   Planning time: 3.467 ms
   Execution time: 864.614 ms

This shows clearly that the big bounding box test on all existing edges
is the problem, even though it already makes use of the GiST index. Are
there other index types available that can discard many far-away
bounding boxes without looking at them, i.e. that use a hierarchy like
an octree? So far I haven't found any, but it looks like the SP-GiST
interface might be useful if one were to create a new one.

Alternatives: There is one approach that I am currently implementing to
improve this, but I'd like to hear whether someone has alternative,
possibly better suggestions.

This is what I tried: create a new table that partitions the space into
many cubes and stores, for each cube, all intersecting edges. New,
updated or deleted nodes then lead to an update of the relevant cubes,
something that would probably have to run asynchronously (I didn't
implement the update yet). Queries then pre-filter the space by looking
only at relevant cubes instead of using the &&& operator with the
bounding box of each edge. For a range of cube sizes this improved
query time quite a bit; the above query ran in about a third of the
time. A sketch of this idea follows below. However, I realize cell
sizes will behave differently for different datasets, and I didn't try
to implement a dynamic cube size update yet. Thinking about it makes me
wonder whether it wouldn't be best to implement an octree, either table
based or as an extension.
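
For illustration, a minimal sketch of the cube table idea; all names
and the cell size (10000) are made up, and the asynchronous update part
is left out:

  CREATE TABLE treenode_edge_cube (
      cx integer NOT NULL,  -- cube coordinate, i.e. floor(x / cell_size)
      cy integer NOT NULL,
      cz integer NOT NULL,
      edge_id bigint NOT NULL REFERENCES treenode_edge (id),
      PRIMARY KEY (cz, cy, cx, edge_id)
  );

  -- Candidate lookup: edges registered in cubes that overlap the query
  -- box, followed by the same exact ST_3DDWithin recheck as before.
  SELECT DISTINCT te.id
  FROM treenode_edge_cube c
  JOIN treenode_edge te ON te.id = c.edge_id
  WHERE c.cx BETWEEN floor(-537284.0 / 10000) AND floor(1456444.0 / 10000)
    AND c.cy BETWEEN floor(-128432.0 / 10000) AND floor(699984.0 / 10000)
    AND c.cz BETWEEN floor(84735.0 / 10000) AND floor(84770.0 / 10000)
    AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_GeomFromText(
        'LINESTRING( -537284.0 -128432.0 84752.5, 1456444.0 -128432.0 84752.5,
                     1456444.0  699984.0 84752.5, -537284.0  699984.0 84752.5,
                     -537284.0 -128432.0 84752.5)')), 17.5);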

I also saw the pointcloud extension, and its indexing (octree) looked
very interesting, but I couldn't really see how it would work for my
use case. I would appreciate ideas if you see a way to make it work.

Best,
Tom
Re: Improve 3D BB query on large set of lines

Rémi Cura
Hey,
This is an interesting problem for sure.
As always with "slow query" topics, improvement can come from a number
of places:
 - hardware: are your tables on SSD, does your index fit in RAM?
 - postgres tuning: have you customised postgresql.conf to fit your hardware?
 - query writing: is your query optimally written? (looks fine here)
 - data model (detailed below)

If you want fast results, you need good indexing, and that is possible
if you exploit the sparseness of your data. So where is the sparseness?

 * Scenario A.1: your usual slices affect a majority of trees
  - you don't really need an index on X,Y, but mainly on Z.
    The idea is to separate your index into index(X,Y) + index(Z).
    Try adding an index on GIST(numrange(ST_ZMin, ST_ZMax)) (you could add it anyway)
    and an index on GIST(geom) (i.e. 2D only).
 * Scenario A.2: your usual slices affect few trees
  - you could easily perform a two-part query, where the first part uses
    an index on tree bounding boxes and the second performs the query on
    edges. This requires creating and maintaining a tree_bounding_box
    table with appropriate indexes, which is quite easy with triggers (a
    trigger sketch follows below). Example (not optimised):
   WITH filtering_by_tree AS (
     SELECT DISTINCT tree_id
     FROM my_sync_table_of_tree_bounding_boxes AS tree
     WHERE tree.bbox &&& your_slice
   ) SELECT edge_id
     FROM treenode_edge AS te
     WHERE te.geom &&& your_slice
      AND EXISTS (SELECT 1 FROM filtering_by_tree AS ft WHERE ft.tree_id = te.tree_id)
  - you could use pgpointcloud to similar effect by constructing a patch per tree.
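
A rough, untested trigger sketch (this assumes a tree_id column on
treenode_edge, which your schema doesn't show, and all names are made
up; a DELETE trigger would be analogous):

   CREATE TABLE my_sync_table_of_tree_bounding_boxes (
     tree_id bigint PRIMARY KEY,
     bbox geometry  -- 3D bounding box of the whole tree
   );

   CREATE OR REPLACE FUNCTION sync_tree_bbox() RETURNS trigger AS $$
   BEGIN
     -- Recompute the whole tree's box (not optimised): ST_3DExtent
     -- aggregates to a box3d, which is cast back to geometry for storage.
     INSERT INTO my_sync_table_of_tree_bounding_boxes (tree_id, bbox)
     SELECT NEW.tree_id, ST_3DExtent(edge)::geometry
     FROM treenode_edge
     WHERE tree_id = NEW.tree_id
     ON CONFLICT (tree_id) DO UPDATE SET bbox = EXCLUDED.bbox;
     RETURN NEW;
   END;
   $$ LANGUAGE plpgsql;

   CREATE TRIGGER tree_bbox_sync
   AFTER INSERT OR UPDATE OF edge ON treenode_edge
   FOR EACH ROW EXECUTE PROCEDURE sync_tree_bbox();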

Now, something is very ambiguous in your data, and it could have major
consequences:
 * Are the edges between nodes that represent a neuron __only logical__
   (straight lines, always the same Zmax-Zmin, representing a conceptual
   graph), or are they polylines that represent the __actual__ neuron path?

 * Scenario B.1.1: edges are only logical and have a constant Zmax-Zmin,
   i.e. nodes have quantized Z:
  - you don't need edges at all for indexing; you should work only on
    nodes (i.e. index nodes in Z + (X,Y), find nodes, then find the
    relevant edges). The great advantage in this case is that you can use
    the full power of pgpointcloud to create meaningful patches (of
    variable size, for instance; see
    http://www.sciencedirect.com/science/article/pii/S092427161630123X,
    section 3.1, "Adapt patch grouping rules"), which makes scaling into
    the range of billions of points possible.
  - you could also think about using the dual of your current graph
    (edges become nodes, nodes become edges).

 * Scenario B.1.2: edges are only logical, but nodes don't have a
   constant Zmax-Zmin: you could insert (virtual) nodes so that you are
   again in scenario B.1.1

 * Scenario B.2: edges are actual neuron paths.
  - you could construct intermediary objects (groups of neurons) of
    variable size (those within cubes), so as to again have a two-step
    query: first look for groups of neurons, then for the neurons inside.

 * Scenario C: your data doesn't have obvious sparsity.
  - use the same strategy you currently use, but scale by partitioning
    your data into numerous tables. See the partitioning entry of the
    postgres manual:
    https://www.postgresql.org/docs/current/static/ddl-partitioning.html
    This would be much easier to create and maintain if you could
    partition on the Z dimension only. Note that you will need to
    duplicate (and then de-duplicate) some edges.


It's hard to recommend anything without further information on your
data/your requirements/your resources.

Cheers,
Rémi-C

Re: Improve 3D BB query on large set of lines

Tom Kazimiers
Hi Rémi,

Thanks a lot for your thorough reply. Especially scenario A.1 has helped
me to improve performance by over an order of magnitude for our most
common use case. My reply got a bit long, sorry; I wanted to provide a
clearer picture of my setup. For further tests I switched to a system
that is an exact mirror of my previous test setup, but with SSDs. The
generated query plan is the same, but the timing is of course a little
bit better. For completeness, this is my original query now:

  SELECT te.id
  FROM treenode_edge te
  WHERE te.edge &&& 'LINESTRINGZ( -537284.0 912464.0 131705.0,
                                  1456444.0  -340912.0 131670.0)'
   AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_GeomFromText(
    'LINESTRING( -537284.0 -340912.0 131687.5, 1456444.0 -340912.0 131687.5,
                 1456444.0  912464.0 131687.5, -537284.0  912464.0 131687.5,
                 -537284.0 -340912.0 131687.5)')), 17.5)
   AND te.project_id = 1;

Like before, I use &&& to get all edges whose bounding box intersects
the query bounding box. ST_3DDWithin is used to lower the number of
false positives, i.e. edges with an intersecting BB that don't actually
intersect the query BB. The query returns 6799 rows and runs in about
600 ms:

  Bitmap Heap Scan on treenode_edge te  (cost=77.04..6555.06 rows=112 width=8) (actual time=595.209..614.640 rows=6799 loops=1)
    Recheck Cond: (edge &&& '01020000800200000000000000886520C100000000A0D82B4100000000C8130041000000003C39364100000000C0CE14C100000000B0120041'::geometry)
    Filter: ((edge && '0103000080010000000500000000000000AB6520C10000000006CF14C100000000B012004100000000AB6520C100000000C3D82B4100000000B0120041000000804D39364100000000C3D82B4100000000C8130041000000804D3936410000000006CF14C100000000C813004100000000AB6520C10000000006CF14C100000000B0120041'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000886520C100000000C0CE14C1000000003C130041000000003C39364100000000C0CE14C1000000003C130041000000003C39364100000000A0D82B41000000003C13004100000000886520C100000000A0D82B41000000003C13004100000000886520C100000000C0CE14C1000000003C130041'::geometry && st_expand(edge, '17.5'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000886520C100000000C0CE14C1000000003C130041000000003C39364100000000C0CE14C1000000003C130041000000003C39364100000000A0D82B41000000003C13004100000000886520C100000000A0D82B41000000003C13004100000000886520C100000000C0CE14C1000000003C130041'::geometry, '17.5'::double precision))
    Rows Removed by Filter: 26
    Heap Blocks: exact=4120
    Buffers: shared hit=24703
    ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..77.01 rows=1679 width=0) (actual time=594.444..594.444 rows=6825 loops=1)
          Index Cond: (edge &&& '01020000800200000000000000886520C100000000A0D82B4100000000C8130041000000003C39364100000000C0CE14C100000000B0120041'::geometry)
          Buffers: shared hit=20583
  Planning time: 0.661 ms
  Execution time: 615.102 ms

The bitmap index scan on the ND GIST index is still the major problem.

On Thu, Jan 19, 2017 at 03:40:47PM +0100, Rémi Cura wrote:
> - hardware : are your tables on SSD, does your index fit in RAM

Our production and test systems have the following setup; they are
mainly used for the database and a web server (no other heavy
processes, barely any load).

- 24 x Intel Xeon CPU E5-2630 v2 @ 2.60GHz
- 128GB RAM (all tables and indices fit into it)
- LSI MegaRAID SAS 2208, RAID 1, 1TB total, built from 2 x 1 TB Samsung
  SSD 850 PRO 1TB (this is where the Postgres data directory is)

The Postgres data directory has a size of 40GB, and one complete dump
of the database of interest is currently about 5GB. Of course we made
sure to mount the partition with the noatime option, and we use the
ext4 filesystem.

> - postgres tuning : you have customised postgresql.conf to fit your
>hardware

Yes, we modified it a bit. This is mainly what we changed:

max_connections = 250
shared_buffers = 16GB
work_mem = 256MB
maintenance_work_mem = 2GB
wal_buffers = 16MB
effective_cache_size = 16GB
default_statistics_target = 1000

These settings have worked well for us so far on most queries. But I
wonder if fewer connections and more work_mem might be useful.

For the sake of completeness, we also tried the parallel worker feature
of Postgres 9.6, but couldn't see any query using it (since no
sequential scans were used in the first place, I believe).

> - query writing : your query is optimally written (looks fine here)

Thanks, this improves my confidence that I didn't miss anything obvious
on the query level. :-)

> - data model (detailed after)

I add some more details below, but first some more context: typically
we deal with individual nodes, stored in a separate treenode table (the
result of the PostGIS query is then joined against it). Each node has a
parent (except the root node), and the edge between them is an actual
part of the neuron (it matches the underlying image data). It will
always be a straight line, though. This is also why we need to respect
edges during queries: we need to know whether the edge between two
nodes intersects a query bounding box, even if the nodes themselves
don't. Edges often cross multiple query bounding boxes/sections, i.e.
their nodes can be multiple query bounding boxes/sections apart. Users
create new nodes and modify or delete old ones all the time.

We typically query in sections of Z, but also support queries in
sections of X or Y. Ideally, all would be equally fast; Z-sections are,
however, the clear focus.

>IF you want fast results, you need a good indexing, and this is
>possible if you exploit the sparsness of your data. So where is the
>sparsness?

The neurons/trees can cover quite a bit of space. There are many really
big ones that cross hundreds of slices. In my current test data set we
have about 10,000 neurons, with an average of 6,000 nodes each. About
50,000 nodes get added per day. Most neurons span between 1,000 and
2,500 sections. A typical section we want to improve performance for
might involve 5,000-10,000 edge intersections belonging to 1,000-2,000
different neurons/trees, but this also varies a lot by region.

> * Scenario A.1: your usual slices affects a majority of trees
>  - you don't really need an index on X,Y, but mainly on Z.
>    The idea is to separate your index into index(X,Y) + index(Z)
>     Try adding an index on GIST(numrange(ST_Zmin,ST_ZMax)) (you could add
>it anyway)
>     and an index on GIST(geom) (aka only 2D).

As long as I can also query from orthogonal perspectives with
reasonable performance, it seems I don't need separate X and Y indices.

First, I added the following indices:

  CREATE INDEX treenode_edge_z_range_gist ON treenode_edge
  USING GIST (numrange(ST_ZMin(edge)::numeric, ST_ZMax(edge)::numeric, '[]'));

  CREATE INDEX treenode_edge_2d_gist ON treenode_edge USING GIST (edge);

Without the type casts, Postgres wouldn't allow it, and inclusive
bounds match the current behavior. My regular query of course doesn't
make use of this index in its current form, but replacing the &&&
operator test with a 2D bounding box check plus a Z index condition
turns out to be really fast:

  EXPLAIN (ANALYZE, BUFFERS) SELECT te.id
  FROM treenode_edge te
  WHERE te.edge && ST_MakeEnvelope(-537284.0, -340912.0, 1456444.0, 912464.0)
    AND numrange(ST_ZMin(te.edge)::numeric, ST_ZMax(te.edge)::numeric) && '[131670.0,131705.0]'::numrange
    AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_GeomFromText(
      'LINESTRING( -537284.0 -340912.0 131687.5, 1456444.0 -340912.0 131687.5,
                   1456444.0  912464.0 131687.5, -537284.0  912464.0 131687.5,
                   -537284.0 -340912.0 131687.5)')), 17.5)
    AND te.project_id = 1;
  ---
  Bitmap Heap Scan on treenode_edge te  (cost=3398.97..123260.14 rows=5007 width=8) (actual time=12.919..33.256 rows=6799 loops=1)
    Recheck Cond: (numrange((st_zmin((edge)::box3d))::numeric, (st_zmax((edge)::box3d))::numeric, '[]'::text) && '[131670.0,131705.0]'::numrange)
    Filter: ((edge && '0103000000010000000500000000000000886520C100000000C0CE14C100000000886520C100000000A0D82B41000000003C39364100000000A0D82B41000000003C39364100000000C0CE14C100000000886520C100000000C0CE14C1'::geometry) AND (edge && '0103000080010000000500000000000000AB6520C10000000006CF14C100000000B012004100000000AB6520C100000000C3D82B4100000000B0120041000000804D39364100000000C3D82B4100000000C8130041000000804D3936410000000006CF14C100000000C813004100000000AB6520C10000000006CF14C100000000B0120041'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000886520C100000000C0CE14C1000000003C130041000000003C39364100000000C0CE14C1000000003C130041000000003C39364100000000A0D82B41000000003C13004100000000886520C100000000A0D82B41000000003C13004100000000886520C100000000C0CE14C1000000003C130041'::geometry && st_expand(edge, '17.5'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000886520C100000000C0CE14C1000000003C130041000000003C39364100000000C0CE14C1000000003C130041000000003C39364100000000A0D82B41000000003C13004100000000886520C100000000A0D82B41000000003C13004100000000886520C100000000C0CE14C1000000003C130041'::geometry, '17.5'::double precision))
    Rows Removed by Filter: 26
    Heap Blocks: exact=4120
    Buffers: shared hit=4428
    ->  Bitmap Index Scan on treenode_edge_range_gist  (cost=0.00..3397.72 rows=75374 width=0) (actual time=12.351..12.351 rows=6825 loops=1)
          Index Cond: (numrange((st_zmin((edge)::box3d))::numeric, (st_zmax((edge)::box3d))::numeric, '[]'::text) && '[131670.0,131705.0]'::numrange)
          Buffers: shared hit=308
  Planning time: 0.856 ms
  Execution time: 33.731 ms

This is a very impressive speed-up, thanks for this suggestion! For all
examples I tried, the results were exactly the same as with the &&&
operator. I only stumbled over the fact that the current semantics use
inclusive bounds, while numrange()'s default behavior is '[)'. Being
able to change this is also a nice feature.
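
For example, the default construction excludes the upper bound, while
the explicit '[]' form includes it:

  SELECT numrange(1.0, 5.0) @> 5.0::numeric;        -- false, '[)' excludes 5
  SELECT numrange(1.0, 5.0, '[]') @> 5.0::numeric;  -- true, inclusive bounds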

I also tested creating range indices for X and Y so that I could
replace the &&& operator with three separate range checks (roughly the
indices sketched below). This, however, was much slower at >2500 ms.
That setup would have been nicer, because it could work equally well
for "slice" queries in every dimension.
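
These indices were built like the Z range index (reconstructed here for
illustration, not the exact statements):

  CREATE INDEX treenode_edge_x_range_gist ON treenode_edge
  USING GIST (numrange(ST_XMin(edge)::numeric, ST_XMax(edge)::numeric, '[]'));

  CREATE INDEX treenode_edge_y_range_gist ON treenode_edge
  USING GIST (numrange(ST_YMin(edge)::numeric, ST_YMax(edge)::numeric, '[]'));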

Anyhow, using your suggested indices and query works very well and
fast, even for differently oriented slices. It is really surprising
that the 2D index plus range index combination is faster than a 3D
index for us.

> * Scenario A.2 : your usual slices affect few trees
>  - you could easily perform a 2 part query, where the first part use an
>index on tree bounding box, and the second perform the query on edge. This
>require to create and maintain a tree_bounding_box table with appropriate
>indexes, which is quite easy with triggers Ex (not optimised):
>   WITH filtering_by_tree AS (
>     SELECT distinct tree_id
>     FROM my_sync_table_of_tree_bounding_boxes as tree
>     WHERE tree.bbox &&& your_slice
>   ) SELECT edge_id
>     FROM treenode_edge  AS te
>     WHERE te.geom &&& your_slice
>      AND EXISTS (SELECT 1 FROM filtering_by_tree As ft WHERE ft.tree_id =
>te.tree_id)
>  - you could use pgpointcloud to similar effect by construction a patch
>per tree.
This is certainly also an appealing idea. I quickly tested it by
creating a neuron summary table with the current bounding boxes as
box3d and then querying in a similar way to what you suggested above. I
couldn't get this very fast though, regardless of whether &&& or a 2D
approach was used. I typically ended up at about 500 ms.

>Now something is very ambiguous in your data, and it could have major
>consequences :
> * Are the edges between nodes that represent neuron __only logical__ (they
>are straight lines, always the same Zmax-Zmin, and represent a conceptual
>graph), or are they polylines that represent the __actual__ neurone path?

Edges reflect the morphology of the actual neuron, and their length
(even in Z alone) can vary a lot.

> * Scenario B.1.1 : edges are only logical and have a constant Zmax-Zmin,
>i.e. nodes have quantified Z :
>  - you don't need edges at all for indexing, you should work only on nodes
>(i.e. index nodes IN Z + (X,Y), find nodes, then find relevant edges). The
>great advantage in this case is that you can use the full power of
>pgpointcloud to create meaningfull patches (of variables size for instance,
>see this
><http://www.sciencedirect.com/science/article/pii/S092427161630123X>, 3.1,
>Adapt patch grouping rules ), which means scalling in the range of billions
>of points possible.
Thanks, that's good to know, but our Zmax-Zmin can vary.

>  - you could try to think of using the dual of your current graph (edges
>become nodes, nodes become edges).

Why could this be an advantage here? I guess if there are helpful data
structures for a dual representation, search time could be improved.

> * Scenario B.1.2 : edges are only logical, but nodes don't have a constant
>Zmax-Zmin. : you could insert (virtual) nodes so you are again in the same
>case as scenario B.1.1

I would prefer not to do this. We did it in the past, but wanted to
reduce the number of nodes needed and improve usability across large
distances.

> * Scenario B.2 : edges are actual neurone path.
>  - you could construct intermediary objects (groups of neurone) of
>variable size (those within cubes), so as to have again a 2 steps query :
>first look for groups of neuron, then for neurons inside

This sounds similar to the grid I have implemented at the moment. So
far I have not been able to test its update performance. In the worst
case it requires asynchronous updates that need to be clever about
locking.

> * Scenario C : your data don't have obvious sparsity.
>  - use the same strategy you currently use, but scale by partitionning
>your data into numerous tables.
>   You can see the partitionning entry of the postgres manual.
><https://www.postgresql.org/docs/current/static/ddl-partitioning.html>
>   This would be much easier to create and maintain if you could partition
>on only Z dimension.
>   Note that you will need to duplicate (and then de-duplicate) some edges.

Thanks for the pointer. Partitioning sounds interesting, but similarly
to the current grid-based approach, getting the CHECK constraints right
seems not so easy for different use cases. With my grid I could see
performance change when I changed the cell size. But I might try to set
this up anyway.

>It's hard to recommend anything without further information on your
>data/your requirement/your ressources.

I know, and I hope I have provided enough information to make some
things clearer. This e-mail got a bit lengthy, but I wanted to address
all the points you listed. You have certainly helped me a lot already,
and I am currently testing the promising-looking 2D-based approach (2D
GiST plus range GiST) further.

Thanks again, also for the nice explanatory graphics!
Tom

Re: Improve 3D BB query on large set of lines

Rémi Cura
Hey Tom,
glad your query is faster,
and thanks for your answers (they help my curiosity)!
Your CATMAID project is very interesting (I come from a computer
vision/graphics background).

An index on 2D plus an index on range is the way to go, then!

It is faster because you manually indicate that your data can be
explained in 2+1 dimensions, which in your case fits the data well
(from what I understood, medical imaging is often physically sensed in
layers along a main axis). 3D indexes are based on R-Trees, which are
difficult to balance and build in 2D, and even more so in 3D.

 * Your server conf/hardware looks fine, I had to ask because sometimes people forget to tune ^^
 * If you get tired of the cast to numeric, you could create a custom "floatrange" type as explained in the docs (a sketch follows right after this list).
 * The dual would be interesting because when looking for the original edges, you could look for points in the dual, simplifying partitioning/indexing a lot. This was just a remark, I don't think you need it.
 * Introducing new nodes is a bad idea indeed, I listed it for completeness.
 * Because of the R-Tree, your grid approach can have some cells that overlap (you relax the strict partitioning of space), which would make this easy to write without locking considerations (which are a super pain to deal with).
 * You could partition by numrange(Z). The key idea would be inheritance and partial partitioning on Z (sketched below, after the next paragraph):
  - you use table inheritance: treenode_edge would have child tables treenode_edge_N. Each child table covers a defined numrange of Z: table1 (z=0 to z=10), table2 (z=8 to z=20). The ranges can overlap (or not).
  - you don't perform a full partition:
   * you put an edge into the best-fitting partition if it fits entirely into one (edge(2,6) would go into table1)
   * edges that don't fit entirely into one table (for instance edge(5,22)) stay in treenode_edge
  - this could actually also help your slices by X and Y
  - inserting would be easy, updating quite easy
  - you wouldn't have complex lock issues
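
The custom range type mentioned above could look like the example from
the Postgres range type documentation, plus a matching index (the index
name is made up):

 CREATE TYPE floatrange AS RANGE (
   subtype = float8,
   subtype_diff = float8mi
 );

 CREATE INDEX treenode_edge_z_floatrange_gist ON treenode_edge
 USING GIST (floatrange(ST_ZMin(edge), ST_ZMax(edge), '[]'));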

I don't think you need partitioning for the moment, but if you want to
grow from 50 million nodes to several hundreds or thousands of
millions, you'll need it (or a switch to MonetDB)!
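
A hypothetical sketch of that inheritance scheme (Postgres 9.6 has no
declarative partitioning; the Z ranges and table names are only
illustrative):

 CREATE TABLE treenode_edge_z_0_10 (
   CHECK (ST_ZMin(edge) >= 0 AND ST_ZMax(edge) <= 10)
 ) INHERITS (treenode_edge);

 CREATE TABLE treenode_edge_z_8_20 (  -- ranges may overlap, as described above
   CHECK (ST_ZMin(edge) >= 8 AND ST_ZMax(edge) <= 20)
 ) INHERITS (treenode_edge);

 -- Edges that fit entirely into one child table go there; everything
 -- else stays in the parent. Queries against treenode_edge see all
 -- children, and with constraint_exclusion enabled plus matching Z
 -- predicates the planner can skip irrelevant children based on the
 -- CHECK constraints.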

All the best for this project,
Cheers,
Rémi-C
