Home of the Squeezebox™ & Transporter® network music players.
  1. #11
    Senior Member
    Join Date
    Jan 2006
    Posts
    225
    Thank you Michael, I'm in the process of pulling down all 18,168 bugs. At about #5,176 currently.

    If anyone is interested, I could probably create a tarball and put it somewhere.

    For reference, this is the command line I'm using. It also grabs the CSS etc. referenced by each page and converts links to point at the local copies:

    % for b in {1000..1999} ; do echo "http://107.21.49.93/show_bug.cgi?id=$b" ; done \
    | wget -E -H -k -K -p -e robots=off -i - -w 0.5 --random-wait

    (that's for one chunk of 1000 bugs)

    Previously I had the loop calling wget on each page, but then it was doing 25 fetches for each bug, whereas this way it's grabbing all the referenced pages once and then doing one fetch for each bug.
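
    To cover every chunk without retyping the range, the same pipeline could be wrapped in a loop. This is only a rough sketch, assuming bash; `bug_urls` and `fetch_chunks` are made-up helper names, and the wget flags are copied unchanged from the command above:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the URL pattern and wget flags come from the post.

# Print one show_bug.cgi URL per bug id in the range [first, last].
bug_urls() {
  local first="$1" last="$2" b
  for ((b = first; b <= last; b++)); do
    echo "http://107.21.49.93/show_bug.cgi?id=$b"
  done
}

# Feed the ids 1..max to wget in chunks, one wget run per chunk, so the
# shared CSS/JS/images are fetched once per chunk rather than once per bug.
fetch_chunks() {
  local max="$1" size="${2:-1000}" start end
  for ((start = 1; start <= max; start += size)); do
    end=$((start + size - 1))
    if ((end > max)); then end=$max; fi
    bug_urls "$start" "$end" |
      wget -E -H -k -K -p -e robots=off -i - -w 0.5 --random-wait
  done
}
```

    Something like `fetch_chunks 18168` would then walk all the bugs in chunks of 1000.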

    I will have to rename the resulting files, as Firefox doesn't seem to like loading a file: URL containing '?'.
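
    The rename could look roughly like this. It is a sketch that assumes bash and assumes '_' is an acceptable stand-in for '?'; `rename_query_files` is a hypothetical name, not something from the actual archive:

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every '?' in file names under a directory
# with '_', e.g. "show_bug.cgi?id=1000.html" -> "show_bug.cgi_id=1000.html".
rename_query_files() {
  local dir="${1:-.}" path parent base
  # [?] matches a literal question mark in both find and bash patterns.
  find "$dir" -type f -name '*[?]*' -print0 |
    while IFS= read -r -d '' path; do
      parent="${path%/*}"
      base="${path##*/}"
      # -n: never overwrite an existing file with the same target name.
      mv -n -- "$path" "$parent/${base//[?]/_}"
    done
}
```

    Note that links inside the saved pages would still point at the old names, so they need rewriting separately. wget also has `--restrict-file-names=windows`, which as far as I know writes '@' in place of '?' at download time and might avoid the rename altogether.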

    Greg

  2. #12
    Senior Member
    Join Date
    May 2017
    Posts
    580
    Awesome, thanks for your effort!
    SqueezeBoxes: 1x Transporter (Living room), 1x SB2 (shed), 1x Radio (Kitchen), 1x Boom (Dining room), 1x piCorePlayer (jacuzzi), 1x piCorePlayer (Garden), 1x OSMC + Squeezelite (Movie room), 1x Touch (Study 2), a few spare units
    Server: LMS on Pi3 7.9.1. on PcP 3.21
    Network: AVM Fritzbox, Netgear Smart Switch 24p, 3x Ubiquity

  3. #13
    Senior Member
    Join Date
    Dec 2009
    Location
    Germany
    Posts
    359
    Note that Bugzilla has a REST API to fetch the data in a more structured (JSON) format.
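
    As a rough illustration, and assuming the server runs Bugzilla 5.0 or later (the release that ships the `/rest/bug/<id>` endpoints; I have not checked which version this instance runs), fetching one bug might look like the sketch below. `bug_rest_url` and `fetch_bug_json` are made-up names, and the host is simply the one from the wget command earlier in the thread:

```shell
#!/usr/bin/env bash
# Assumption: Bugzilla 5.0+ with the REST API enabled on this host.
BASE_URL="http://107.21.49.93"

# Build the REST URL for one bug id (hypothetical helper).
bug_rest_url() {
  printf '%s/rest/bug/%s\n' "$BASE_URL" "$1"
}

# Save a bug and its comments as two JSON files in an output directory.
fetch_bug_json() {
  local id="$1" out="${2:-bugs-json}"
  mkdir -p "$out"
  # -f: fail on HTTP errors, -sS: quiet but still report errors.
  curl -fsS "$(bug_rest_url "$id")" -o "$out/$id.json"
  curl -fsS "$(bug_rest_url "$id")/comment" -o "$out/$id.comments.json"
}
```

    If the instance is older than that, these URLs will simply fail and scraping the HTML remains the fallback.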
    sent from a computer using a keyboard

  4. #14
    Senior Member
    Join Date
    Jan 2006
    Posts
    225
    Yeah, I would probably use the API if I were doing it over. I'm not sure I'm motivated to figure that out, though: what I have is fine, and I can tolerate a little inconvenience since I'm unlikely to use this archive regularly. I did realize I forgot to grab the attachments, so I'm about halfway through that now. I can rewrite links (to other bugs, comments, and attachments) in the pages if I get motivated at some point, but I'll have the data in a reasonably usable format shortly.

  5. #15
    Senior Member
    Join Date
    Jan 2006
    Posts
    225
    Thank you again, Michael. I think I'm all set now.

    Greg

  6. #16
    Senior Member
    Join Date
    Nov 2012
    Posts
    219
    Did you get all the bugs, or just the "open" ones?

    In many cases, bugs were closed that shouldn't have been, and even the ones that were properly closed often had good info in them.

    Any chance the bugs could be posted as static pages and ergo not as big a risk to the server?

    Edit: where on GitHub are new ones being put? I'm not too familiar with GitHub.
    Last edited by BJW; 2019-10-19 at 17:34.
    Using: Win7 64 + LMS 7.9 & Duet & ipads w/the logitech app, and ipeng on an ipod
    http://wiki.slimdevices.com/index.ph..._Artists_logic & http://wiki.slimdevices.com/index.php/Compilations

  7. #17
    Senior Member
    Join Date
    Jan 2006
    Posts
    225
    Hi BJW, I grabbed everything: all bugs and all attachments (even ones not referenced from bugs, because it was easier to just loop from 1 to 7765). They are essentially static pages now, though there are a lot of broken links that I wasn't planning to fix. I do plan to fix the cross-links (to other bugs, or to specific comments of the same bug) and the links to attachments; it's not a high priority for me right now, but it shouldn't take much time. The whole thing is about 8 GB (I presume it will compress some). I don't have a place to host it, but I can put a tarball somewhere, either now or once I fix up the links mentioned above.
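
    For the cross-link fix, a sed pass along these lines might do. It is a sketch under assumptions: the local files are assumed to be named like `show_bug.cgi_id=N.html` (that naming is a guess, not the archive's actual layout), and `rewrite_bug_links` is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: point links to other bugs (optionally with a
# "#cN" comment anchor) at local files named show_bug.cgi_id=N.html.
# Reads the named files, or stdin when no file is given; writes to stdout.
rewrite_bug_links() {
  sed -E 's|href="(https?://[^"/]+/)?show_bug\.cgi\?id=([0-9]+)(#c[0-9]+)?"|href="show_bug.cgi_id=\2.html\3"|g' "$@"
}
```

    For example, `rewrite_bug_links page.html > page.fixed.html` leaves the original untouched so the result can be checked before replacing anything. Attachment links would need a similar rule of their own.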

    Greg
