h4ck3rm1k3's Comments
Post | Comment |
---|---|
Very fast osm processing in C++ | OK, I have a Perl script to generate XML constants here:
A scheme file here:
The latest version has a makefile, and I have also generated a list of fields:
This is just a first version; more work is needed to create an optimal recogniser for the schema. It should be possible to generate a lex-like structure to process the rest, but for now I am doing switches based on the field names. This version looks up each node reference in the id -> coords table and also outputs the entire names database of the nodes, ways and relations. It runs in 10 seconds on my computer with a larger version of the OSM file that has some duplicates, where I tried to resolve the missing nodes in the extract file.
For comparison, a plain word count needs 5x less time.
So it is still fast, even though it is doing much more processing.
I am going to make some template classes for the processing of fields and for defining structures... here is a start that I have not even compiled: |
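A rough, uncompiled sketch of what such field-processing templates might look like (hypothetical names, plain C++17, not the original snippet): each attribute is described once as a (name, pointer-to-member) pair so the parser can dispatch on field names generically instead of through one long hand-written switch.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <type_traits>

// Hypothetical sketch: map an OSM XML attribute name to a typed slot
// on a destination record.
template <typename Record, typename T>
struct Field {
    const char* name;       // attribute name as it appears in the XML
    T Record::* member;     // destination slot on the record
};

struct NodeRecord {
    int64_t id  = 0;
    double  lat = 0.0;
    double  lon = 0.0;
};

// Convert the raw attribute string and store it if the key matches.
template <typename Record, typename T>
bool apply(Record& rec, const Field<Record, T>& f,
           const std::string& key, const std::string& value) {
    if (key != f.name) return false;
    if constexpr (std::is_integral_v<T>)
        rec.*(f.member) = static_cast<T>(std::stoll(value));
    else
        rec.*(f.member) = std::stod(value);
    return true;
}

int main() {
    NodeRecord n;
    Field<NodeRecord, int64_t> id  {"id",  &NodeRecord::id};
    Field<NodeRecord, double>  lat {"lat", &NodeRecord::lat};
    Field<NodeRecord, double>  lon {"lon", &NodeRecord::lon};

    apply(n, id,  "id",  "12345");
    apply(n, lat, "lat", "40.7");
    apply(n, lon, "lon", "-74.0");

    std::cout << n.id << " " << n.lat << " " << n.lon << "\n";
}
```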
Very fast osm processing in C++ | Yes, I am rewriting that Perl script in C++ now.
I don't want to collect any huge memory structure in the parser; the client should be able to do that. mike |
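Roughly, that split could look like this (a hypothetical SAX-style sketch; `OsmHandler` and `CountingHandler` are made-up names, not the actual rewrite):

```cpp
#include <cstdint>
#include <iostream>

// The parser keeps no large tables; it forwards each element to a
// handler supplied by the client, which decides what (if anything) to store.
struct OsmHandler {
    virtual void node(int64_t id, double lat, double lon) = 0;
    virtual void wayNodeRef(int64_t wayId, int64_t nodeRef) = 0;
    virtual ~OsmHandler() = default;
};

// A client that only counts elements; another client could build
// the id -> coords table instead.
struct CountingHandler : OsmHandler {
    long nodes = 0, refs = 0;
    void node(int64_t, double, double) override { ++nodes; }
    void wayNodeRef(int64_t, int64_t) override { ++refs; }
};

// The real parser would read the XML stream and call the handler;
// here we just feed it a couple of fake elements to show the flow.
void parse(OsmHandler& h) {
    h.node(1, 40.7, -74.0);
    h.node(2, 40.8, -74.1);
    h.wayNodeRef(10, 1);
}

int main() {
    CountingHandler c;
    parse(c);
    std::cout << c.nodes << " nodes, " << c.refs << " way refs\n";
}
```

The point of the design is that the parser never owns a big in-memory structure; a client that needs the id -> coords map builds it inside its own handler.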
New version of osm2poly.pl to extract from the cloudmade admin borders | The upload is finished: |
Polygon files for NJ ZCTA on the way | Here is the second part! |
Polygon files for NJ ZCTA on the way | You can see the difference between the ZCTA and the "zip codes". Here are the ZCTAs:
There are differences that you can see between the two versions.
But we have to start somewhere! |
Polygon files for NJ ZCTA on the way | The first part has finished uploading: |
Polygon files for NJ ZCTA on the way | I found a mashup that shows just what I am planning on doing.
Here is more info on zip codes:
So if anyone wants to add any information about them, do it there. mike |
Hacking the OSM tools today Osm2PgSql and Osm2Poly | Here is a nice tool to double-check a zip code if there are any questions: http://zip4.usps.com/zip4/welcome.jsp |
New Host for OSM data , archive.org | I have been playing with QGIS, and it looks like there is a feature to create a convex hull based on an attribute value. So you could take these attribute values (post codes), create a convex hull for each, and then compare it to the ZCTA. That would give you a good start, because you could compare the areas with the biggest differences first. The other thing is that you can flag the nodes and ways that are outside the ZCTA; that is what I was doing to check them. Maybe other states have more problems with the zip codes, but NJ looks very stable. mike |
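Roughly the idea, as a standalone sketch rather than the QGIS feature itself: group points by a post-code attribute and build one convex hull per group with the monotone-chain algorithm (the sample data and the `Pt`/`convexHull` names are made up):

```cpp
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Pt { double x, y; };

double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone-chain convex hull, counter-clockwise, no external libs.
std::vector<Pt> convexHull(std::vector<Pt> pts) {
    std::sort(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    if (pts.size() < 3) return pts;
    std::vector<Pt> hull(2 * pts.size());
    size_t k = 0;
    for (size_t i = 0; i < pts.size(); ++i) {               // lower hull
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (size_t i = pts.size() - 1, t = k + 1; i-- > 0;) {  // upper hull
        while (k >= t && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    hull.resize(k - 1);
    return hull;
}

int main() {
    // hypothetical sample: points tagged with their post code
    std::map<std::string, std::vector<Pt>> byZip = {
        {"08540", {{0, 0}, {1, 0}, {1, 1}, {0, 1}, {0.5, 0.5}}}};
    for (auto& [zip, pts] : byZip) {
        auto hull = convexHull(pts);
        std::cout << zip << ": hull has " << hull.size() << " points\n";
    }
}
```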
New Host for OSM data , archive.org | I was just following the wiki, |
New Host for OSM data , archive.org | Yes, of course. In Germany I found power lines, security cameras and trees. |
New Host for OSM data , archive.org | I have hacked osm2pgsql so that it imports the data from my feeds:
The data is loaded in QGIS. I will be creating some Postgres queries to split up the data and process it; that is at least my plan. I don't care if the monolithic OSM database stores this data or not. In fact, I think it would be better to keep it separate until we find a better way to add in layers. Ideally the chunks of data will be usable directly from some Git repository, and we will split them into very small but useful pieces. mike |
New Host for OSM data , archive.org | Yes, well, we will be able to check them all out.
If we find data that does not match or crosses the border, it can be split up or marked to be fixed manually. Given a hierarchy of data, we would match it based on the attributes of the EPA datapoints (see the sketch after this comment): does the county match the county from TIGER? Does the zip code match the zip code from the census? The census said that they will not update this data, but we can. Given enough test data (zip code attributes) we can find all the ones that break the model and fix them. Anyway, there is a huge market for this type of processing, and I think that OSM or something like it is the right way to go. I will not commit this data to OSM, but keep the OSM files on archive.org; if we get enough updates, we can put them into a Git repository... I am starting to think that the monster database idea is not a very good one anyway. mike If it turns out that the zip code from the zcta produces bad data, |
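A hypothetical sketch of that cross-check, with the TIGER and census lookups stubbed out (none of these names come from the real tools or data):

```cpp
#include <iostream>
#include <string>
#include <vector>

// Each EPA point carries the county / zip code it claims; we compare those
// against the values looked up from the containing TIGER county and census
// ZCTA polygons (the lookups are stubbed here for illustration).
struct EpaPoint {
    std::string id, county, zip;
    double lat, lon;
};

// Stubs: in a real run these would do point-in-polygon lookups.
std::string countyFromTiger(double, double) { return "Mercer"; }
std::string zipFromZcta(double, double)     { return "08540"; }

int main() {
    std::vector<EpaPoint> points = {
        {"EPA-1", "Mercer", "08540", 40.35, -74.65},
        {"EPA-2", "Mercer", "08601", 40.22, -74.76},  // zip disagrees
    };
    for (const auto& p : points) {
        bool countyOk = (p.county == countyFromTiger(p.lat, p.lon));
        bool zipOk    = (p.zip == zipFromZcta(p.lat, p.lon));
        if (!countyOk || !zipOk)
            std::cout << p.id << " breaks the model (county ok: " << countyOk
                      << ", zip ok: " << zipOk << ")\n";
    }
}
```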
New Host for OSM data , archive.org | Yes, I have been looking at the data. There are cases where the boundaries are not exactly matching; this will all have to be reviewed. My idea is to make a program to look for containment hierarchies in the data (this region contains this one) and to flag errors... mike |
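A rough sketch of such a containment check, using a plain ray-casting point-in-polygon test to flag child vertices that fall outside the supposed parent (all names and sample polygons are illustrative):

```cpp
#include <iostream>
#include <string>
#include <vector>

struct Pt { double x, y; };
struct Region { std::string name; std::vector<Pt> ring; };  // closed ring, last vertex != first

// Classic ray-casting test: count crossings of a horizontal ray from p.
bool pointInPolygon(const Pt& p, const std::vector<Pt>& ring) {
    bool inside = false;
    for (size_t i = 0, j = ring.size() - 1; i < ring.size(); j = i++) {
        bool crosses = (ring[i].y > p.y) != (ring[j].y > p.y);
        if (crosses &&
            p.x < (ring[j].x - ring[i].x) * (p.y - ring[i].y) /
                      (ring[j].y - ring[i].y) + ring[i].x)
            inside = !inside;
    }
    return inside;
}

// Flag every vertex of the child region that is not inside the parent.
void checkContainment(const Region& parent, const Region& child) {
    for (const Pt& v : child.ring)
        if (!pointInPolygon(v, parent.ring))
            std::cout << child.name << " has a vertex outside " << parent.name
                      << " at (" << v.x << ", " << v.y << ")\n";
}

int main() {
    Region county {"county", {{0, 0}, {10, 0}, {10, 10}, {0, 10}}};
    Region zcta   {"zcta",   {{2, 2}, {12, 2}, {12, 5}, {2, 5}}};  // leaks out on the east side
    checkContainment(county, zcta);
}
```

The same check, run over a whole hierarchy (state contains county contains ZCTA), is one simple way to surface the boundary mismatches automatically before manual review.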
New Host for OSM data , archive.org | Yes, we have two levels: 3-digit ones and 5-digit ones.
Of course I can import them... But I will first send a mail to the list. |
EPA Bulk Import | I am going to remove all the data that has not been updated by someone manually.
mike |
open letter to the EPA | It is not junk. If you think it is junk, then revert the changesets and we don't need to talk about it anymore. With a modify command, I will modify the data and adjust it. But I think the data is still usable as it is: not perfect, but a good start. mike |
open letter to the EPA | I have started a wiki page: osm.wiki/EPAGeospatial. Please add your comments there. |
EPA Bulk Import | Hi tomh, I understand your concerns. We will see how the community reacts. I have gotten mixed messages. mike |
Next Project for the EPA and Mine data | Well, of course. I am thinking about just using a standard module. There are other things to do with these nodes: 1. looking for duplicates (pre-existing) |
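One simple way such a duplicate check could work, sketched with made-up data rather than a standard module: snap coordinates to a small grid and group nodes that land in the same cell.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

// Illustrative sketch: nodes whose coordinates round to the same
// ~1e-6 degree cell are reported as possible pre-existing duplicates.
struct Node { int64_t id; double lat, lon; };

int main() {
    std::vector<Node> nodes = {
        {1, 40.123456, -74.654321},
        {2, 40.123456, -74.654321},   // same position as node 1
        {3, 40.200000, -74.700000},
    };

    auto key = [](double lat, double lon) {
        return std::make_pair(std::llround(lat * 1e6), std::llround(lon * 1e6));
    };

    std::map<std::pair<long long, long long>, std::vector<int64_t>> cells;
    for (const auto& n : nodes)
        cells[key(n.lat, n.lon)].push_back(n.id);

    for (const auto& entry : cells)
        if (entry.second.size() > 1) {
            std::cout << "possible duplicates:";
            for (auto id : entry.second) std::cout << " " << id;
            std::cout << "\n";
        }
}
```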