I’m a retired Unix sysadmin. Over the years I’ve built things in COBOL, FORTRAN, C, Perl, REXX, PHP, Visual Basic, various Unix shells and maybe others. Nothing has been a real “application” - mostly just utilities to help me get things done.

Now that I’m retired, and it’s cold outside, I’m curious to try some more coding - and I have an idea.

The music communities here seem to post links to YouTube. I generally use Lemmy on my phone but don’t use YouTube, or listen to music, on my phone if I can help it. I’d like to scrape a music community here and add the songs posted to a playlist in my MusicBrainz account.

Does that sound like a reasonable learner project? Any suggestions for language and libraries appreciated. My preferred IDE is vim on bash and I have a home server running Linux where this could run as a daemon, or be scheduled.

  • Diabolo96@lemmy.dbzer0.com · 9 months ago

    The best programming language for automating things is Python. It is easy and comes with a lot of modules that let you do almost anything. I guarantee that once you start automating stuff it’ll become like a drug, and you’ll just “automate it” whenever you have anything repetitive.

    And BTW, one of the main uses of Python is web scraping.

    https://musicbrainz.org/doc/MusicBrainz_API
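As a rough illustration of what talking to that API could look like from Python, here is a minimal sketch that builds a recording search request using only the standard library. The helper names, the `USER_AGENT` value and the result limit are placeholders made up for this example. It only covers looking a song up; adding anything to an account needs authentication, which is not shown here.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://musicbrainz.org/ws/2"
# MusicBrainz asks clients to identify themselves; this value is a placeholder.
USER_AGENT = "lemmy-music-collector/0.1 ( you@example.com )"


def _quote(term):
    """Wrap a search term in double quotes, escaping quotes and backslashes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def recording_search_url(title, artist=None, limit=5):
    """Build a search URL for recordings matching a title and optional artist."""
    query = f"recording:{_quote(title)}"
    if artist:
        query += f" AND artist:{_quote(artist)}"
    params = urllib.parse.urlencode({"query": query, "fmt": "json", "limit": limit})
    return f"{API_ROOT}/recording?{params}"


def search_recordings(title, artist=None):
    """Run the search over the network and return the list of matches."""
    request = urllib.request.Request(
        recording_search_url(title, artist),
        headers={"User-Agent": USER_AGENT},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("recordings", [])


if __name__ == "__main__":
    for recording in search_recordings("Heroes", "David Bowie"):
        print(recording.get("id"), recording.get("title"))
```

Keeping the URL building separate from the network call makes it easy to check the query by eye before sending anything.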

    • some_guy@lemmy.sdf.org · 9 months ago

      The best language for automation is the one you know best. The second best is one you have to learn.

      I think you could do this in bash with youtube-dl.

      • Diabolo96@lemmy.dbzer0.com · 9 months ago (edited)

        Indeed. While my bash-fu is rudimentary at best, I don’t think Bash can be used for web scraping? But I think he could use RSS to get the posts, then extract YouTube links with a regex and use the JSON dump feature of yt-dlp* to get the video category, title, etc., parsing the JSON with jq. Then it’s probably just a matter of using curl to do the API calls, and voilà.

        *yt-dlp is better maintained than youtube-dl, or so I heard.
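The pipeline described above (RSS, then a regex for YouTube links, then yt-dlp's JSON dump) could be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested tool: the function names are invented, the regex only covers the common `watch?v=` and `youtu.be` link shapes, and fields such as `artist` or `track` are often absent from yt-dlp's output, so they are read defensively.

```python
import json
import re
import subprocess
import xml.etree.ElementTree as ET

# Matches the common watch and short-link forms; other URL shapes are not handled.
YOUTUBE_RE = re.compile(
    r"https?://(?:www\.|m\.|music\.)?(?:youtube\.com/watch\?v=|youtu\.be/)([\w-]{11})"
)


def youtube_ids_from_rss(rss_text):
    """Return unique video IDs found in the feed's <item> elements, in order."""
    root = ET.fromstring(rss_text)
    seen = []
    for item in root.iter("item"):
        # Search the text of every direct child (title, link, description, ...).
        blob = " ".join(child.text or "" for child in item)
        for video_id in YOUTUBE_RE.findall(blob):
            if video_id not in seen:
                seen.append(video_id)
    return seen


def fetch_metadata(video_id):
    """Ask yt-dlp for a video's metadata as JSON without downloading it."""
    result = subprocess.run(
        ["yt-dlp", "--dump-json", "--skip-download",
         f"https://www.youtube.com/watch?v={video_id}"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(info):
    """Keep only the fields of interest; missing ones come back as None."""
    return {key: info.get(key) for key in ("title", "artist", "track", "categories")}
```

Feeding `summarize(fetch_metadata(video_id))` for each ID into whatever does the API calls would complete the chain; `fetch_metadata` needs yt-dlp installed and on the PATH.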

        • some_guy@lemmy.sdf.org · 9 months ago

          I built two scrapers for a website that hosts images and videos using bash.

          They’re educational, I swear! /s

          I looked through the HTML and figured out regexes for their media. The scripts parse all the links on the thumbnail pages and then load the corresponding primary pages with curl. On those pages, they then use wget to grab the file. Some additional pattern matching names the file after the post.

          It’s probably convoluted, but you can accomplish a lot in bash if you want to.
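For comparison, the same two ideas (pulling links out of a page with a regex, and naming a file after the post) might look like this in Python. It is a sketch, not the scripts described above: the patterns and function names are made up, and regexes over HTML only hold up when the markup is as regular as assumed here.

```python
import re
from urllib.parse import urljoin

# Assumes double-quoted href attributes on <a> tags.
HREF_RE = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def post_links(index_html, base_url):
    """Return absolute URLs for every link found on an index page."""
    return [urljoin(base_url, href) for href in HREF_RE.findall(index_html)]


def safe_filename(page_html, extension):
    """Derive a filesystem-friendly file name from the page's <title>."""
    match = TITLE_RE.search(page_html)
    title = match.group(1).strip() if match else "untitled"
    # Collapse anything other than word characters, dots and hyphens.
    cleaned = re.sub(r"[^\w.-]+", "_", title).strip("_")
    return f"{cleaned or 'untitled'}.{extension}"
```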

    • Great Blue Heron@lemmy.ca (OP) · 9 months ago

      I find Python difficult - no idea why, it just doesn’t feel right. I’ve tried a few times but never been able to do anything useful with it - that’s why it’s not in my list above. It does seem, though, that my proposed project and development “style” are best suited to Python. Maybe it’s time to try again.

      • some_guy@lemmy.sdf.org · 9 months ago

        If you work in bash and don’t like Python, maybe it’s too strict. Look into Ruby. It was inspired by Perl. I found it more to my style in that there are many correct solutions and not one implied correct solution.