arcpy: Efficient Geometry Creation
Sometimes every second counts… and even if it doesn’t, it’s still interesting to see the quirks of a familiar library.
It turns out that object creation can be somewhat expensive, especially when every call travels Python -> ArcObjects -> COM. With arcpy (and the underlying ArcObjects), some objects can be reused to gain a little efficiency.
An interesting example is simply creating a polyline from a pair of points.
Example: Creating a 2 point polyline
Simplest form:
The most direct version builds every object from scratch on each call:
import arcpy

def create_line_simple(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    # New Point and Array objects are created on every call
    start_point = arcpy.Point(*point1_x_and_y)
    end_point = arcpy.Point(*point2_x_and_y)
    array = arcpy.Array([start_point, end_point])
    polyline = arcpy.Polyline(array, spatial_ref)
    return polyline

# Usage:
polyline = create_line_simple((-81.60, 36.20), (-81.70, 36.30), arcpy.SpatialReference(4326))
There’s nothing wrong with this code. In fact, if you’re only creating a few polylines, stop here. It’s readable and gets the job done.
A little more efficient:
However, if you’re creating thousands of polylines, some time can be saved by reusing arcpy.Point objects.
# module-scoped private variables, created once and reused on every call
_start_point = arcpy.Point()
_end_point = arcpy.Point()

def create_line_reuse_points(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    # Overwrite the coordinates of the existing points instead of creating new ones
    _start_point.X, _start_point.Y = point1_x_and_y
    _end_point.X, _end_point.Y = point2_x_and_y
    array = arcpy.Array([_start_point, _end_point])
    polyline = arcpy.Polyline(array, spatial_ref)
    return polyline
In this case, we create two module-scoped points only once and then set the X and Y properties on those points for each call. The arcpy.Polyline constructor reads X and Y from those points, but it doesn’t maintain a reference to them. Setting properties on existing objects is a bit more efficient than creating new objects every time, and since the polyline keeps no references to those objects, we’re safe from a memory perspective.
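Why the "doesn’t maintain a reference" part matters can be shown without arcpy at all. The sketch below is plain Python: MutablePoint, make_line_copying, make_line_by_reference and build are hypothetical stand-ins, not arcpy APIs. It only illustrates the general idea that reusing a mutable object is safe when the consumer copies the values out, and unsafe when it keeps the object itself.

```python
class MutablePoint:
    """Hypothetical stand-in for a reusable point (this is NOT arcpy.Point)."""

    def __init__(self, x=0.0, y=0.0):
        self.X = x
        self.Y = y


def make_line_copying(points):
    """Consumer that copies coordinates out of the points it is given."""
    return tuple((p.X, p.Y) for p in points)


def make_line_by_reference(points):
    """Consumer that keeps the point objects themselves."""
    return tuple(points)


# Shared points, reused across calls like _start_point and _end_point
_start, _end = MutablePoint(), MutablePoint()


def build(consumer, start_xy, end_xy):
    """Set the shared points, then hand them to the given consumer."""
    _start.X, _start.Y = start_xy
    _end.X, _end.Y = end_xy
    return consumer([_start, _end])


# Copying consumer: the first line keeps its own coordinates
copied_a = build(make_line_copying, (0, 0), (1, 1))
copied_b = build(make_line_copying, (5, 5), (6, 6))
print(copied_a)  # first line is unchanged by the second call

# Reference-keeping consumer: the first line silently changes
ref_a = build(make_line_by_reference, (0, 0), (1, 1))
ref_b = build(make_line_by_reference, (5, 5), (6, 6))
print([(p.X, p.Y) for p in ref_a])  # now shows the second call's coordinates
```

The reuse trick in this post relies on arcpy.Polyline behaving like the copying consumer, as described above.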
Even more efficient:
Why not go ahead and reuse the arcpy.Array as well? Once again, arcpy.Polyline() only reads data from the array and doesn’t maintain a reference to it. Just make sure to call removeAll() on the array after each polyline is built, so that it is empty for the next call.
# module-scoped private variables, created once and reused on every call
_start_point = arcpy.Point()
_end_point = arcpy.Point()
_array = arcpy.Array()

def create_line_reuse_points_array(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    _start_point.X, _start_point.Y = point1_x_and_y
    _end_point.X, _end_point.Y = point2_x_and_y
    _array.add(_start_point)
    _array.add(_end_point)
    polyline = arcpy.Polyline(_array, spatial_ref)
    # Empty the shared array so the next call starts from scratch
    _array.removeAll()
    return polyline
How much more efficient is this approach?
Here are the total times (printed as h:mm:ss, so roughly 16 to 21 seconds) for creating 100,000 polylines with each function (Python 3.4.1 with ArcGIS Pro on a Core i7-4712HQ):
Create line simple:
0:00:21.071529
Create line reuse points:
0:00:17.813275
Create line reuse points and array:
0:00:16.277035
Is it a huge difference? Not really. But if you have a process that creates a large number of geometries, it’s worth considering reusing a few objects.
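To put the timings above in relative terms, the short sketch below converts each total into a per-polyline cost and a saving relative to the simple version. The numbers are copied from the results listed above; parse_elapsed and the dictionary keys are names made up for this illustration.

```python
REPETITIONS = 100000

# Totals as printed by the test run (h:mm:ss.ffffff)
results = {
    "simple": "0:00:21.071529",
    "reuse points": "0:00:17.813275",
    "reuse points and array": "0:00:16.277035",
}


def parse_elapsed(text):
    """Convert an 'h:mm:ss.ffffff' string to seconds as a float."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


baseline = parse_elapsed(results["simple"])
summary = {}
for name, text in results.items():
    total = parse_elapsed(text)
    per_call_us = total / REPETITIONS * 1e6          # microseconds per polyline
    saving_pct = (baseline - total) / baseline * 100  # relative to the simple version
    summary[name] = (per_call_us, saving_pct)
    print("{}: {:.1f} us per polyline, {:.1f}% faster than simple".format(
        name, per_call_us, saving_pct))
```

On these figures, reusing the points saves about 15.5% and reusing the points and the array saves about 22.8%, which works out to a few tens of microseconds per polyline.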
Here’s the full test script to produce the above results:
import arcpy
from datetime import datetime as dt
def time_me(n):
    """ decorator to print total time to run function n number of times """
    def time_me_decorator(f):
        def wrapper(*args):
            start = dt.now()
            for _ in range(n):
                f(*args)
            print(dt.now() - start)
        return wrapper
    return time_me_decorator
REPETITIONS = 100000
######## Simple Case
@time_me(REPETITIONS)
def create_line_simple(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    start_point = arcpy.Point(*point1_x_and_y)
    end_point = arcpy.Point(*point2_x_and_y)
    array = arcpy.Array([start_point, end_point])
    polyline = arcpy.Polyline(array, spatial_ref)
    return polyline
######## Reuses the point objects
# module-scoped private variables
_start_point = arcpy.Point()
_end_point = arcpy.Point()
@time_me(REPETITIONS)
def create_line_reuse_points(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    _start_point.X, _start_point.Y = point1_x_and_y
    _end_point.X, _end_point.Y = point2_x_and_y
    array = arcpy.Array([_start_point, _end_point])
    polyline = arcpy.Polyline(array, spatial_ref)
    return polyline
######## Reuses the point and array objects
# module-scoped private variables
_start_point = arcpy.Point()
_end_point = arcpy.Point()
_array = arcpy.Array()
@time_me(REPETITIONS)
def create_line_reuse_points_array(point1_x_and_y, point2_x_and_y, spatial_ref):
    """ creates polyline from pair of (x,y) tuples """
    _start_point.X, _start_point.Y = point1_x_and_y
    _end_point.X, _end_point.Y = point2_x_and_y
    _array.add(_start_point)
    _array.add(_end_point)
    polyline = arcpy.Polyline(_array, spatial_ref)
    _array.removeAll()
    return polyline
# Run our tests
if __name__ == "__main__":
    WGS_84 = arcpy.SpatialReference(4326)
    POINT1 = (-81.674525, 36.216630)
    POINT2 = (-81.675351, 36.213886)
    print("Create line simple:")
    create_line_simple(POINT1, POINT2, WGS_84)
    print("")
    print("Create line reuse points:")
    create_line_reuse_points(POINT1, POINT2, WGS_84)
    print("")
    print("Create line reuse points and array:")
    create_line_reuse_points_array(POINT1, POINT2, WGS_84)
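One side effect of the time_me decorator in this script is that each decorated name now runs the timing loop and returns None rather than a polyline. If the functions also need to stay usable on their own, one option is to leave them undecorated and time them from the outside with the standard library's timeit module. The sketch below assumes that approach; time_calls and make_pair are hypothetical names, and make_pair is only a stand-in workload so that the snippet runs without arcpy.

```python
import timeit


def time_calls(func, *args, repetitions=100000):
    """Return the total seconds taken to call func(*args) `repetitions` times."""
    return timeit.timeit(lambda: func(*args), number=repetitions)


def make_pair(start_xy, end_xy):
    """Stand-in workload (hypothetical); replace with a real geometry function."""
    return (start_xy, end_xy)


elapsed = time_calls(make_pair, (0.0, 0.0), (1.0, 1.0), repetitions=1000)
print("make_pair: {:.6f} s for 1000 calls".format(elapsed))

# The function itself is untouched and still returns its normal result
print(make_pair((0.0, 0.0), (1.0, 1.0)))
```

With arcpy available, the same helper could be pointed at an undecorated create_line_simple, for example time_calls(create_line_simple, POINT1, POINT2, WGS_84). Note that timeit disables garbage collection while timing by default, so its totals may differ slightly from the datetime-based figures above.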