A project I worked on, the Toronto Public Washroom Import, recently finished up, and I thought it would be interesting to do a quick lessons learned:
Top three things that went well:
- Having help from other people makes things much easier. I am grateful to RockTeam for helping with the changesets, and to Jarek_Piórkowski and a couple of members of the Civic Tech Toronto community for their input on the import plan.
- Doing a couple of “test” imports and recording an instruction video significantly improved the written import plan and instructions. It was also very helpful to have mapped a couple of the relevant features via survey.
- For the script that converted the City data to OSM tags, setting and validating assumptions about the City data using Pandera helped give me a lot of confidence that the output wouldn’t be affected by upstream changes to the City data.
Top three things I would change if I did it again:
- Increase the time estimate for completing all the changesets. I had originally expected it would take two weeks at most, but it took closer to three (even with two people doing changesets).
- For the initial data profiling, I might try using e.g. Tableau Public, since it would be a little faster and easier than e.g. using `.value_counts(dropna=False)` in pandas.
- Have the data transformation script estimate how many washrooms should be conflated vs. net new in each changeset - this would provide a simple comparable metric to ensure that the conflation plugin was configured correctly for each changeset.
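
For reference, the pandas-based profiling mentioned above boils down to something like the following. This is a minimal sketch on a made-up sample; the column names and values are hypothetical stand-ins for the City data.

```python
import pandas as pd

# Made-up sample standing in for the City data.
df = pd.DataFrame(
    {
        "type": ["Washroom Building", "Portable Toilet", None, "Washroom Building"],
        "accessible": ["Yes", None, None, "No"],
    }
)


def profile(frame: pd.DataFrame) -> dict:
    """Map each column name to its value counts, counting missing values too."""
    return {col: frame[col].value_counts(dropna=False) for col in frame.columns}


profiles = profile(df)
for col, counts in profiles.items():
    print(f"--- {col} ---")
    print(counts.to_string())
```

Passing `dropna=False` matters for profiling because missing values then show up as their own row in the counts instead of being dropped.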
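
One way the conflated vs. net new estimate could be sketched is a simple distance check between the City points and the washrooms already in OSM. Everything here is an assumption for illustration: the function names, the 50 m threshold and the coordinates are made up, and a real conflation plugin may match features differently.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def estimate_conflation(city_points, osm_points, max_distance_m=50):
    """Return (expected_conflated, expected_new) counts for one changeset.

    A City point counts as "conflated" if any existing OSM point lies
    within max_distance_m of it (hypothetical threshold).
    """
    conflated = sum(
        1
        for clat, clon in city_points
        if any(
            haversine_m(clat, clon, olat, olon) <= max_distance_m
            for olat, olon in osm_points
        )
    )
    return conflated, len(city_points) - conflated


# Made-up coordinates: the first City point sits a few metres from an
# existing OSM washroom, the second has nothing nearby.
city = [(43.6532, -79.3832), (43.7000, -79.4000)]
osm = [(43.6533, -79.3833)]
expected_conflated, expected_new = estimate_conflation(city, osm)
```

Comparing these two numbers against what the conflation plugin reports for a changeset would flag a misconfigured match distance early, as long as the threshold here is set to the same value as in the plugin.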