
screenscraping

Posted by h4ck3rm1k3 on 8 August 2009 in English

Here is a broken web page from the government of Kosovo that is supposed to provide property information.

The problem is that it is horrible: nothing works.

Here is the list of properties for rent in one area:
http://www.kpaonline.org/ReadyForRentMun/ReadyForRentMun.aspx?mun=Dakovica

Here is the list of properties in administration:
http://www.kpaonline.org/AdminMun.asp?mun=Dakovica

You need to search with the Serbian names in Google rather than on the site, because the site's own search does not work.

Search string: Gjakovë/Djakovica site:kpaonline.org

http://www.google.com/search?hl=en&hs=tJ2&q=Gjakov%C3%AB%2FDjakovica+site%3Akpaonline.org&

You can then fetch the data like this:

wget -r -l4 -D www.kpaonline.org 'http://www.kpaonline.org/AdminMun.asp?mun=Dakovica'
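What the recursive download relies on is following the links to the individual property pages. Here is a minimal Python sketch of that link-extraction step. It assumes the property pages follow the resultPAmun.asp?IS=... pattern seen on the site; the function name property_links and the regular expression are my own illustration, not anything the site or wget provides.

```python
import re
from urllib.parse import urljoin

# Assumed base URL and link pattern, based on the property page URLs on the site.
BASE_URL = "http://www.kpaonline.org/"
LINK_RE = re.compile(
    r'href=["\']?([^"\'\s>]*resultPAmun\.asp\?IS=[A-Za-z0-9]+)',
    re.IGNORECASE,
)

def property_links(html, base=BASE_URL):
    """Return the absolute URLs of the property pages linked from html, without duplicates."""
    seen = []
    for match in LINK_RE.finditer(html):
        url = urljoin(base, match.group(1))
        if url not in seen:
            seen.append(url)
    return seen
```

Feeding it the text of a downloaded listing page gives the list of pages that still need to be fetched.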


Discussion

Comment from h4ck3rm1k3 on 8 August 2009 at 09:47

So Kosovo is in UTM zone 34T (northern hemisphere):

http://home.hiwaay.net/~taylorc/toolbox/geography/geoutm.html

for this property :
http://www.kpaonline.org/resultPAmun.asp?IS=DS605487
GPS Grid UTM 0453384/4692348

The conversion page above gives (longitude/latitude):
20.433725867427746/42.38201849774153
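The same conversion can be reproduced offline. Here is a rough sketch using the standard series formulas for the inverse transverse Mercator projection. It assumes the grid values are WGS84 UTM, which the site does not state; the function name utm_to_latlon is my own.

```python
import math

def utm_to_latlon(easting, northing, zone, northern=True):
    """Convert UTM coordinates (assumed WGS84) to (latitude, longitude) in degrees."""
    a = 6378137.0                 # WGS84 semi-major axis in metres
    f = 1 / 298.257223563         # WGS84 flattening
    k0 = 0.9996                   # UTM scale factor on the central meridian
    e2 = f * (2 - f)              # first eccentricity squared
    ep2 = e2 / (1 - e2)           # second eccentricity squared

    x = easting - 500000.0        # remove the false easting
    y = northing if northern else northing - 10000000.0

    # Footpoint latitude from the meridional arc length.
    mu = (y / k0) / (a * (1 - e2 / 4 - 3 * e2**2 / 64 - 5 * e2**3 / 256))
    e1 = (1 - math.sqrt(1 - e2)) / (1 + math.sqrt(1 - e2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    n1 = a / math.sqrt(1 - e2 * sin1**2)
    r1 = a * (1 - e2) / (1 - e2 * sin1**2) ** 1.5
    t1 = tan1**2
    c1 = ep2 * cos1**2
    d = x / (n1 * k0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * ep2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * ep2 - 3 * c1**2) * d**6 / 720)
    lon0 = math.radians((zone - 1) * 6 - 180 + 3)   # central meridian of the zone
    lon = lon0 + (
        d
        - (1 + 2 * t1 + c1) * d**3 / 6
        + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * ep2 + 24 * t1**2) * d**5 / 120) / cos1
    return math.degrees(lat), math.degrees(lon)

lat, lon = utm_to_latlon(453384, 4692348, 34)
print(f"{lat:.6f} {lon:.6f}")
```

For the property above this should land very close to the longitude/latitude pair quoted here, which is a handy cross-check before converting the whole list.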

Comment from h4ck3rm1k3 on 8 August 2009 at 09:50

That is:
42° 22' 55" N, 20° 26' 1.4" E
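The arithmetic behind that is just splitting off the fractional part twice. A small Python sketch (the helper name to_dms is my own, and it only handles non-negative values, which is enough for Kosovo):

```python
def to_dms(degrees):
    """Split a non-negative decimal degree value into (degrees, minutes, seconds)."""
    d = int(degrees)
    minutes_full = (degrees - d) * 60   # fractional degrees expressed in minutes
    m = int(minutes_full)
    s = (minutes_full - m) * 60         # fractional minutes expressed in seconds
    return d, m, s

for name, value in (("lat", 42.38201849774153), ("lon", 20.433725867427746)):
    d, m, s = to_dms(value)
    print(f"{name}: {d}deg {m}' {s:.1f}\"")
```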

Comment from h4ck3rm1k3 on 8 August 2009 at 09:56

Here is my first point:
osm.org/browse/changeset/2073369

Comment from h4ck3rm1k3 on 8 August 2009 at 10:30

Here are my commands:
grep -h -A2 GPS resultPAmun.asp\?IS\=* | sort -u | grep ^04 | cut -d\< -f1 | sed -e 's;\/; ;g' > points.txt
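For anyone who finds the grep/sed pipeline hard to read, here is roughly the same extraction as a Python sketch. It assumes each saved page contains an easting/northing pair of the form 0453384/4692348 (seven digits each), as on the example property page; the name utm_pairs and the regular expression are my own.

```python
import re

# Assumed layout: an easting/northing pair such as "0453384/4692348", seven digits each.
PAIR_RE = re.compile(r"\b(\d{7})\s*/\s*(\d{7})\b")

def utm_pairs(pages):
    """Return the sorted, de-duplicated (easting, northing) strings found in the page texts."""
    found = set()
    for text in pages:
        for easting, northing in PAIR_RE.findall(text):
            if easting.startswith("04"):    # same filter as `grep ^04`
                found.add((easting, northing))
    return sorted(found)

sample = ["<td>GPS Grid UTM</td>\n<td>\n0453384/4692348</td>"]
for easting, northing in utm_pairs(sample):
    print(easting, northing)    # one "easting northing" line per point, like points.txt
```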

cs2cs -E +proj=utm +zone=34 +datum=WGS84 +units=m +to +proj=longlat +datum=WGS84 -f "%.9f" < points.txt > convert.txt
perl ~/Desktop/maps/openstreetmapkosova/convert2osm.pl convert.txt > new.osm

There were two data errors in the input (the converted latitudes, 4.28 and 36.76, are nowhere near Kosovo):
0477588 4673224 20.798029599 4.281299370 0.000000000
0488275 4668363 20.868633114 36.760955805 0.000000000
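Rows like these can be caught automatically by checking every converted point against a bounding box. A sketch in Python, assuming the column order of the cs2cs -E output shown above (easting, northing, longitude, latitude, height); the box limits are my own rough guess at Kosovo's extent, and suspicious_rows is a made-up name.

```python
# Rough bounding box for Kosovo in degrees. These limits are an assumption, not survey data.
KOSOVO_BOX = {"lat": (41.8, 43.3), "lon": (20.0, 21.8)}

def suspicious_rows(lines, box=KOSOVO_BOX):
    """Return the converted rows whose longitude or latitude falls outside the box."""
    bad = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue                    # not a converted row
        lon, lat = float(fields[2]), float(fields[3])
        lat_ok = box["lat"][0] <= lat <= box["lat"][1]
        lon_ok = box["lon"][0] <= lon <= box["lon"][1]
        if not (lat_ok and lon_ok):
            bad.append(line)
    return bad

rows = [
    "0477588 4673224 20.798029599 4.281299370 0.000000000",
    "0488275 4668363 20.868633114 36.760955805 0.000000000",
]
for row in suspicious_rows(rows):
    print("check:", row)
```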

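My convert script:

use strict;
use warnings;

# NOTE: the XML literals of this script were stripped when the page was
# rendered. The header, node and footer strings below are an assumed
# reconstruction of a minimal OSM file, not the original text.
print q[<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="convert2osm.pl">
];

my $id = 0;
while (<>)
{
    # cs2cs prints "lon lat height"; with -E the input easting and northing
    # are echoed in front, so skip those two columns when they are present.
    if (/^\s*(?:\d+\s+\d+\s+)?(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$/)
    {
        $id--;    # negative ids mark objects that do not exist on the server yet
        print qq[  <node id="$id" lat="$2" lon="$1" />\n];
    }
    else
    {
        die "error $_";
    }
}

print q[</osm>
];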
