cello's comments

Post	Time	Comment
A New File Format for OSM Data	5 months ago	Do you know GeoDesk (https://www.geodesk.com)? I think they are trying to achieve something similar. They also have a custom file format that groups OSM data by region for faster access. However, their files are actually larger than the source PBF files, not smaller as yours are.
A New File Format for OSM Data	5 months ago	A very impressive design for what is essentially a query-optimized database for OSM data. Thanks a lot for your efforts and for making it public! You currently have an option to compress parts of the file (`zip_chunks` in the source code), which uses Java's `DeflaterOutputStream`. This is essentially the same compression that `gzip` uses. While it is simple to use because it ships with Java, the deflate algorithm is known to be comparatively slow, with low throughput. More modern alternatives are ZStandard (https://facebook.github.io/zstd/) and LZ4 (https://github.com/lz4/lz4-java), which are much faster for both compression and decompression, and Zstd might even compress better than the default deflate. The library might lose a bit of its appeal, as it would no longer be self-contained and would depend on other code, but I think it could be worth it if it becomes even faster and creates even smaller files. So, the general feedback from my lines above might be: (1) include some additional bits or bytes in the header for future needs, just to future-proof your format; (2) do not use only 1 bit for compression = true|false, but maybe 3 bits (giving values 0 to 7): 0 = uncompressed, 1 = deflate, 2 = zstandard, 3 = lz4, 4-7 = future use.
Using the Oma Library	5 months ago	Very nice, and very impressive from a performance point of view! I think it looks very promising, and I love that there is already a Java API/library for this! As you requested feedback in your first blog post, here are some of my thoughts as a software developer (only after looking at your posts; I have not yet tested the library). TypeFilter: `r.setFilter(new AndFilter(power_of_town, new TypeFilter("A"))); while (true) { … }` In the `TypeFilter`, what are the possible parameter values? `A` (area), `W` (way), `N` (node). Are there others? What happens if I pass `a`, `Q`, `#` or `?`? Instead of taking a `char` as argument, you could probably use an enum with a fixed set of options. Multiple queries with the same reader: In your example, you seem to reuse your reader `r` three times, each time setting a filter and then just calling `r.next()`. For me, this is a bit confusing. Traditionally in Java, I have a query method or a method returning an iterator, and then I can iterate over the found values (e.g. with `iterator.hasNext(); iterator.next()`). In your code, the reader has a `next()` method that seems to reset automatically when you set a filter, but this "reset" is not really visible in the code. Maybe something like the following would be a nicer API design? `OmaIterator iter = r.query(new AndFilter(power_of_town, new TypeFilter("A"))); while (true) { Object o = iter.next(); if (o == null) break; .... }` Although then people might try to run two queries in parallel on the same reader, and I don't know whether that is supported. I think what confuses me is that I don't see where or when the query happens: I set a filter, and suddenly I can call `next` to get results. This does not seem intuitive to me (but others might have a different opinion on that).
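
To make the header suggestion from the compression comment concrete, here is a minimal sketch of a 3-bit codec field packed into one flags byte. Everything in it is hypothetical: the name `ChunkCodec`, the position of the field in the low 3 bits, and the flags byte itself are assumptions for illustration and are not taken from the actual Oma file format. Only the id assignment (0 to 3, with 4-7 reserved) follows the comment.

```java
/** Hypothetical 3-bit codec identifier; not part of the actual Oma format. */
enum ChunkCodec {
    NONE(0), DEFLATE(1), ZSTD(2), LZ4(3); // ids 4-7 stay reserved for future use

    // Assumption: the codec lives in the lowest 3 bits of a flags byte.
    private static final int MASK = 0b111;

    final int id;

    ChunkCodec(int id) {
        this.id = id;
    }

    /** Returns the flags with this codec stored in the low 3 bits; other bits are kept. */
    int writeTo(int flags) {
        return (flags & ~MASK) | id;
    }

    /** Decodes the codec from the low 3 bits and rejects the reserved ids 4-7. */
    static ChunkCodec readFrom(int flags) {
        int id = flags & MASK;
        for (ChunkCodec codec : values()) {
            if (codec.id == id) {
                return codec;
            }
        }
        throw new IllegalArgumentException("Reserved codec id: " + id);
    }
}
```

Rejecting the reserved ids on read means an older reader fails loudly on a file written with a newer codec, instead of misinterpreting compressed bytes.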
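
The two API suggestions from the library comment (an enum instead of a character for the element type, and a query method that returns an iterator) could look roughly like the sketch below. It is a stand-in under stated assumptions, not the Oma API: `ElementType`, `Element` and `SketchReader` are hypothetical names, and the reader here filters an in-memory list rather than reading an Oma file.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical replacement for the "A"/"W"/"N" type characters. */
enum ElementType { NODE, WAY, AREA }

/** Hypothetical minimal element; a real one would carry geometry and tags. */
final class Element {
    final ElementType type;
    final String name;

    Element(ElementType type, String name) {
        this.type = type;
        this.name = name;
    }
}

/** Hypothetical reader: each query() call returns its own iterator. */
final class SketchReader {
    private final List<Element> elements;

    SketchReader(List<Element> elements) {
        this.elements = List.copyOf(elements);
    }

    /** The query visibly starts here, instead of implicitly when a filter is set. */
    Iterator<Element> query(Predicate<Element> filter) {
        return elements.stream().filter(filter).iterator();
    }
}
```

With this shape, a caller would write `Iterator<Element> it = reader.query(e -> e.type == ElementType.AREA);` and loop with `hasNext()`/`next()`. An invalid type such as `Q` or `#` can no longer be passed at all, because the compiler only accepts the enum constants. In this in-memory stand-in, two iterators from the same reader are independent; whether a file-backed reader could support that is a separate question.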
