Very fast osm processing in C++

OK, I have a Perl script to generate XML constants here:
http://bazaar.launchpad.net/%7Ejamesmikedupont/%2Bjunk/EPANatReg/annotate/head%3A/makenames.pl

A schema file here:
http://bazaar.launchpad.net/%7Ejamesmikedupont/%2Bjunk/EPANatReg/annotate/head%3A/schema.txt

The latest version has a makefile, and I have also generated a list of fields:
http://bazaar.launchpad.net/%7Ejamesmikedupont/%2Bjunk/EPANatReg/annotate/head%3A/OSMAttributes.h

This is just a first version; I will need to put more work into building an optimal recognizer for the schema. It should be possible to generate a lex-like structure to process the rest.

But for now, I am doing switches based on the field names.
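Roughly like this (the enum values and names here are stand-ins; the real constants come out of the generated OSMAttributes.h):

#include <cstdlib>
#include <cstring>

// Stand-in constants; the generated OSMAttributes.h defines the real ones.
enum FieldId { FIELD_ID, FIELD_LAT, FIELD_LON, FIELD_UNKNOWN };

static FieldId lookupField(const char* name) {
    if (!std::strcmp(name, "id"))  return FIELD_ID;
    if (!std::strcmp(name, "lat")) return FIELD_LAT;
    if (!std::strcmp(name, "lon")) return FIELD_LON;
    return FIELD_UNKNOWN;
}

struct NodeData { long long id; double lat, lon; };

// Dispatch one XML attribute into the node record being built.
static void handleAttribute(NodeData& node, const char* name, const char* value) {
    switch (lookupField(name)) {
        case FIELD_ID:  node.id  = std::atoll(value); break;
        case FIELD_LAT: node.lat = std::atof(value);  break;
        case FIELD_LON: node.lon = std::atof(value);  break;
        default: break; // unrecognized attribute, skip it
    }
}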

Now, this version looks up each node reference in the id -> coords table and also outputs the entire names database of the nodes, ways, and relations.
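The id -> coords table is the simple part; a minimal sketch of the idea (the names are mine, not the actual code):

#include <cstdio>
#include <unordered_map>

struct Coord { double lat, lon; };

// One flat hash table from node id to coordinates; way processing
// resolves each <nd ref="..."> against it.
static std::unordered_map<long long, Coord> nodeCoords;

static void onNode(long long id, double lat, double lon) {
    nodeCoords[id] = Coord{lat, lon};
}

static void onWayNodeRef(long long ref) {
    auto it = nodeCoords.find(ref);
    if (it == nodeCoords.end()) {
        std::fprintf(stderr, "missing node %lld\n", ref); // dangling ref in the extract
        return;
    }
    std::printf("%lld %f %f\n", ref, it->second.lat, it->second.lon);
}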

It runs in 10 seconds on my computer on a larger version of the OSM file (with some duplicates, from where I tried to resolve the missing nodes in the extract file):
real 0m10.667s

For comparison, wc needs about 5x less time:
time wc lint.osm
393773 1974640 30893704 lint.osm
real 0m1.896s

So it is still fast, even though it is doing much more processing.
I think this is a real winner, folks.

I am going to make some template classes for processing fields and defining structures... here is a start that I have not even compiled:
http://bazaar.launchpad.net/%7Ejamesmikedupont/%2Bjunk/EPANatReg/annotate/head%3A/Field.h
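Since that file is only linked, here is just a guess at the shape such field templates might take (nothing here is the actual Field.h):

#include <cstdlib>
#include <string>

// One template per field type; a record is then composed of typed fields.
template <typename T>
struct Field {
    std::string name;
    T value{};
    void parse(const char* raw); // specialized per type below
};

template <> inline void Field<long long>::parse(const char* raw)   { value = std::atoll(raw); }
template <> inline void Field<double>::parse(const char* raw)      { value = std::atof(raw); }
template <> inline void Field<std::string>::parse(const char* raw) { value = raw; }

// Example record built out of typed fields.
struct NodeRecord {
    Field<long long>   id{"id"};
    Field<double>      lat{"lat"};
    Field<double>      lon{"lon"};
    Field<std::string> user{"user"};
};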

Very fast osm processing in C++

Yes, I am rewriting that Perl script in C++ now.
In the end you will be able to define filters on which attributes you want to collect, and then get them in a callback.

I don't want to build any huge in-memory structure in the parser; the client should be able to do that.
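To give an idea of the interface I have in mind (a hypothetical shape, not the actual code): you register the attribute names you care about plus a callback, and the parser hands over only the matching tags.

#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Tags = std::map<std::string, std::string>;
using Callback = std::function<void(long long id, const Tags&)>;

class OsmParser {
public:
    // Only attributes named in `wanted` are passed on; the parser
    // itself keeps no big structures around.
    void addFilter(std::set<std::string> wanted, Callback cb) {
        filters.push_back({std::move(wanted), std::move(cb)});
    }

    // Called internally once a <node>/<way>/<relation> element is complete.
    void emit(long long id, const Tags& tags) {
        for (auto& f : filters) {
            Tags matched;
            for (auto& kv : tags)
                if (f.wanted.count(kv.first)) matched.insert(kv);
            if (!matched.empty()) f.cb(id, matched);
        }
    }

private:
    struct Filter { std::set<std::string> wanted; Callback cb; };
    std::vector<Filter> filters;
};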

mike

New version of osm2poly.pl to extract from the CloudMade admin borders

The upload is finished:
http://www.archive.org/details/NJ_Counties

Polygon files for NJ ZCTA on the way

Here is the second part!
http://www.archive.org/details/ZCTA_NJ2

Polygon files for NJ ZCTA on the way

You can see the difference between the ZCTAs and the "ZIP codes".

Here are the ZCTAs:
http://maps.huge.info/zcta.htm

You can see differences between the two versions.
I understand the panic better now.

But we have to start somewhere!

Polygon files for NJ ZCTA on the way

The first part has finished uploading:
http://ia341335.us.archive.org/2/items/ZCTA_NJ/
07001-07878

Polygon files for NJ ZCTA on the way

I found a mashup that shows just what I am planning on doing:
http://maps.huge.info/zip.htm

Here is more info on ZIP codes:
http://en.wikipedia.org/wiki/ZIP_Code_Tabulation_Area
http://en.wikipedia.org/wiki/ZIP_code

So if anyone wants to add any information about them, do it there.

mike

Hacking the OSM tools today: Osm2PgSql and Osm2Poly

Here is a nice tool to double-check a ZIP code if there are any questions: http://zip4.usps.com/zip4/welcome.jsp

New Host for OSM data, archive.org

I have been playing with QGIS, and it looks like there is a feature to create a convex hull based on an attribute value.

So you could take these attribute values (post codes), create a convex hull for each, and then compare it to the ZCTA. That would give you a good start, because you could compare the areas with the biggest differences first.
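QGIS does this through its GUI, but to sketch the comparison in code (I am using Andrew's monotone chain here as an assumption; QGIS may compute hulls differently):

#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Pt { double x, y; };

static double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain: returns hull vertices in counter-clockwise order.
static std::vector<Pt> convexHull(std::vector<Pt> pts) {
    std::sort(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    if (pts.size() < 3) return pts;
    std::vector<Pt> hull(2 * pts.size());
    size_t k = 0;
    for (size_t i = 0; i < pts.size(); ++i) {                // lower hull
        while (k >= 2 && cross(hull[k-2], hull[k-1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (size_t i = pts.size() - 1, t = k + 1; i-- > 0; ) {  // upper hull
        while (k >= t && cross(hull[k-2], hull[k-1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    hull.resize(k - 1);
    return hull;
}

// Group points by their post code tag, then hull each group for
// comparison against the matching ZCTA polygon.
static std::map<std::string, std::vector<Pt>> hullsByPostcode(
        const std::map<std::string, std::vector<Pt>>& pointsByPostcode) {
    std::map<std::string, std::vector<Pt>> out;
    for (const auto& kv : pointsByPostcode) out[kv.first] = convexHull(kv.second);
    return out;
}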

The other thing is that you can flag the nodes and ways that fall outside the ZCTA; that is what I was doing to check them. Maybe other states have more problems with their ZIP codes, but NJ looks very stable.
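The flagging itself can be a plain even-odd point-in-polygon test; a sketch, assuming the ZCTA polygon is already loaded as a coordinate list:

#include <vector>

struct Pt2 { double x, y; };

// Classic even-odd ray casting: true if p lies inside poly.
static bool insidePolygon(const Pt2& p, const std::vector<Pt2>& poly) {
    bool in = false;
    for (size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        bool crosses = (poly[i].y > p.y) != (poly[j].y > p.y);
        if (crosses &&
            p.x < (poly[j].x - poly[i].x) * (p.y - poly[i].y) /
                  (poly[j].y - poly[i].y) + poly[i].x)
            in = !in;
    }
    return in;
}

// Collect the indices of all nodes falling outside their supposed ZCTA.
static std::vector<size_t> flagOutside(const std::vector<Pt2>& nodes,
                                       const std::vector<Pt2>& zcta) {
    std::vector<size_t> flagged;
    for (size_t i = 0; i < nodes.size(); ++i)
        if (!insidePolygon(nodes[i], zcta)) flagged.push_back(i);
    return flagged;
}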

mike

New Host for OSM data, archive.org

I was just following the wiki. Personally I would like zipcode, but see here:
osm.wiki/Key:postal_code

New Host for OSM data, archive.org

Yes, of course. In Germany I found power lines, security cameras and trees.
But we still need a better staging system. Why should we throw it all into a single database? We could have many databases for various layers. This is a design issue. In fact, why do we need a monster database at all? Can't we deal with lots of small files and a smart editor that commits them the right way, so that we don't need anything more than a smart distributed version control system?

New Host for OSM data, archive.org

I have hacked osm2pgsql so that it imports the data from my feeds:
http://fmtyewtk.blogspot.com/2009/12/osm2pgsql-hack-for-importing-id-ways.html

The data is loaded in qgis.

I will be creating some Postgres queries to split up the data and process it. That, at least, is my plan.

I don't care whether the monolithic OSM database stores this data or not. In fact, I think it would be better to keep it separate until we find a better way to add in layers.

Ideally the chunks of data will be usable directly from some Git repository, and we will split them into very small but useful pieces.

mike

New Host for OSM data, archive.org

Yes, well, we will be able to check them all out.
My plan is to create a hierarchy of data, where each region (state) contains another region (county), and so forth (relations and ways that contain each other).

If we find data that does not match, or that crosses a border, it can be split up or marked to be fixed manually.

Given a hierarchy of data, we would then match it against the attributes of the EPA datapoints. Does the county match the county from TIGER? Does the zipcode match the zipcode from the census?
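As a sketch of that cross-check (the record shapes here are hypothetical):

#include <cstdio>
#include <map>
#include <string>

struct EpaPoint { std::string county, zip; };  // tags carried by the datapoint
struct Resolved { std::string county, zip; };  // what the containment hierarchy says

// Compare each point's own tags against the regions it actually
// falls in, and report every disagreement for review.
static void crossCheck(const std::map<long long, EpaPoint>& points,
                       const std::map<long long, Resolved>& resolved) {
    for (const auto& kv : points) {
        auto it = resolved.find(kv.first);
        if (it == resolved.end()) continue;
        if (kv.second.county != it->second.county)
            std::printf("%lld: county mismatch (%s vs %s)\n", kv.first,
                        kv.second.county.c_str(), it->second.county.c_str());
        if (kv.second.zip != it->second.zip)
            std::printf("%lld: zip mismatch (%s vs %s)\n", kv.first,
                        kv.second.zip.c_str(), it->second.zip.c_str());
    }
}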

The census said that they will not update this data, but we can. Given enough test data (ZIP code attributes) we can find all the points that break the model and fix them.

Anyway, there is a huge market for this type of processing and I think that OSM or something like it is the right way to go.

I will not commit this data to OSM, but will keep the OSM files on archive.org.

If we get enough updates, we can put them into a Git repository...

I am starting to think that the monster database idea is not a very good one anyway...

mike

If it turns out that the zip code from the zcta produces bad data,

New Host for OSM data, archive.org

Yes, I have been looking at the data. There are cases where the boundaries do not match exactly; this will all have to be reviewed.

My idea is to make a program that looks for containment hierarchies in the data (this region contains that one) and flags errors...
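As a first pass the check could be as crude as bounding boxes; a sketch (a real border check would need full polygon tests):

#include <cstdio>
#include <string>
#include <vector>

struct Region {
    std::string name;
    double minLat, minLon, maxLat, maxLon;
};

// Cheap first filter: anything failing even the bounding-box test
// definitely needs a manual look.
static bool contains(const Region& outer, const Region& inner) {
    return inner.minLat >= outer.minLat && inner.maxLat <= outer.maxLat &&
           inner.minLon >= outer.minLon && inner.maxLon <= outer.maxLon;
}

static void checkHierarchy(const Region& state, const std::vector<Region>& counties) {
    for (const auto& c : counties)
        if (!contains(state, c))
            std::printf("FLAG: %s crosses the border of %s\n",
                        c.name.c_str(), state.name.c_str());
}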

mike

New Host for OSM data, archive.org

Yes, we have two levels: 3-digit ones and 5-digit ones.
The 3-digit ones contain the 5-digit ones.
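Since the containment is just a prefix relationship, grouping them is trivial; for example:

#include <map>
#include <string>
#include <vector>

// The 5-digit ZCTAs nest under their 3-digit parent by prefix.
static std::map<std::string, std::vector<std::string>>
groupByPrefix(const std::vector<std::string>& zcta5) {
    std::map<std::string, std::vector<std::string>> parents;
    for (const auto& z : zcta5)
        parents[z.substr(0, 3)].push_back(z);  // "07001" -> "070"
    return parents;
}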

Of course I can import them... But I will first send a mail to the list.
mike

EPA Bulk Import

I am going to remove all the data that has not been manually updated by someone.
I am currently working on downloading and processing all the points, and will set up separate hosting for the data files.

mike

open letter to the EPA

It is not junk. If you think it is junk, then revert the changesets and we don't need to talk about it anymore.

With a modify command, I will modify and adjust the data. But I think the data is still usable as it is: not perfect, but a good start.

mike

open letter to the EPA

I have started a wiki page: osm.wiki/EPAGeospatial. Please add your comments there.

EPA Bulk Import

Hi tomh,

I understand your concerns. We will see how the community reacts. I have gotten mixed messages.

mike

Next Project for the EPA and Mine data

Well, of course. I am thinking about just using a standard module.

There are other things to do with these nodes:

1. Looking for duplicates (pre-existing); a rough sketch of this follows below.
2. Looking for out-of-date information.
3. Looking for better ways to render them.
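For the duplicate search in item 1, one approach is to bucket the nodes on a coarse coordinate grid (1e-5 degrees, about a meter) and inspect any bucket holding more than one node. Pairs straddling a bucket edge are missed here, but it is a start:

#include <cmath>
#include <cstdio>
#include <unordered_map>
#include <vector>

struct Node { long long id; double lat, lon; };

static void findDuplicateCandidates(const std::vector<Node>& nodes) {
    std::unordered_map<long long, std::vector<const Node*>> grid;
    for (const auto& n : nodes) {
        // Combine the two grid indices into one hashable key.
        long long key = std::llround(n.lat * 1e5) * 40000000LL
                      + std::llround(n.lon * 1e5);
        grid[key].push_back(&n);
    }
    for (const auto& kv : grid)
        if (kv.second.size() > 1)
            for (const Node* n : kv.second)
                std::printf("candidate duplicate: node %lld\n", n->id);
}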