My process? Microsoft Excel. Notepad++. CMD. Filezilla. Waterfox. youtube-dl

My process? Microsoft Excel. Notepad++. CMD. Filezilla. Waterfox. youtube-dl [to capture the videos from the Vine service], and probably a few other tools.

I create the XML metas in Excel, but since Excel (I have ancient Excel 2000 – yes, 2000) doesn’t let me put separate lines inside a single row, I use ^ to stand in for CR/LF, then use a nifty program called SNR (or FNR?) for Search/Find and Replace – that lets me change all of the XMLs to their proper form at once. I used to use one called BK Replacem, but that stopped working around WinXP. I love this because I can correct massive mistakes in thousands of XML files already created in subdirectories. For example, I keep carrying over a persistent error where I use the label “medatype” instead of “mediatype”, and my eyes miss it every time, so I have to correct every file on my computer in one pass.
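For anyone who would rather script that step, here is a minimal Python sketch of the same idea. It is not the SNR tool mentioned above, just a stand-in. It assumes the files are UTF-8, that ^ never appears as real data, and that every file to fix ends in .xml. The function names are made up for illustration.

```python
from pathlib import Path


def fix_xml_text(text: str) -> str:
    """Expand the ^ placeholder into CR/LF and correct the mistyped label."""
    return text.replace("^", "\r\n").replace("medatype", "mediatype")


def fix_xml_tree(root: str) -> int:
    """Rewrite every .xml file under root (subdirectories included).

    Returns the number of files that actually changed.
    Assumes UTF-8 files; bytes are read and written directly so that
    Python does not translate the CR/LF line endings.
    """
    changed = 0
    for path in Path(root).rglob("*.xml"):
        original = path.read_bytes().decode("utf-8")
        fixed = fix_xml_text(original)
        if fixed != original:
            path.write_bytes(fixed.encode("utf-8"))
            changed += 1
    return changed
```

Calling `fix_xml_tree` on the top-level folder touches only the files that need a change, so running it a second time is harmless. Try it on a copy of the folder first.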

The dirs + XML files that I will be transferring over, I create using a little VBA (Visual Basic) code that does the work for me. I also make various batch files to move and rename files en masse. Not PowerShell. Just BAT: I create a listing with dir /b > z.txt, open it in Notepad, open it in Excel, add the activity (like move or whatever I need), go back to Notepad, save it as a batch file, go to cmd and run it. Seems to be the best process for me.
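The listing-to-batch-file round trip can also be sketched in a few lines of Python. This is only an illustration of the idea, not the Excel workflow itself. It assumes the listing is the plain output of dir /b (one file name per line), and the destination folder name is hypothetical.

```python
def build_batch(listing: str, dest_dir: str) -> str:
    """Turn a `dir /b` listing into the text of a .bat file of move commands.

    listing  -- contents of the listing file, one file name per line
    dest_dir -- hypothetical destination folder for the moved files
    """
    lines = ["@echo off"]
    for name in listing.splitlines():
        name = name.strip()
        if not name:
            continue  # skip blank lines in the listing
        # Quote both paths so names containing spaces survive in cmd.
        lines.append(f'move "{name}" "{dest_dir}\\{name}"')
    # Batch files conventionally use CR/LF line endings.
    return "\r\n".join(lines) + "\r\n"
```

The returned text can be saved as a .bat file and read over before running it in cmd, which keeps the "look at everything first" habit of the spreadsheet approach.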

I create the submission URLs in Excel, and now the Edit URLs too, because I need to edit a handful of items. I hate unnecessary clicking through screens and would rather go straight to the item editor, which I can now do easily through Excel.
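Building URLs from a column of identifiers is plain string substitution, whatever tool does it. The sketch below shows the idea in Python. The template passed in is a placeholder on example.org, not the real submission or edit URL format, which is not spelled out here, and the function name is hypothetical.

```python
from urllib.parse import quote


def build_urls(identifiers, template):
    """Fill a URL template once per identifier.

    template -- a string containing {identifier}; the caller supplies the
                real pattern (the one used in the test is a placeholder)
    Blank identifiers are skipped, and each identifier is percent-encoded
    so that spaces or odd characters do not break the URL.
    """
    urls = []
    for raw in identifiers:
        ident = raw.strip()
        if ident:
            urls.append(template.format(identifier=quote(ident, safe="")))
    return urls
```

The resulting list can be pasted straight into whatever opens the URLs in bulk.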

I use an extension on my Waterfox/Firefox browser called “Open Multiple Locations”. I set the URLs to open 10 seconds apart, which is a very long time for computers, but I want to be completely respectful and not overwhelm the IA processes. I set the priority to -6. I found my browser can handle 250 in a row without throwing up all over my poor little laptop here.
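The pacing above (10 seconds apart, no more than 250 in a row) can be written down as a small schedule. This Python sketch only plans the timing; it does not drive the browser extension, and the function name is made up for illustration.

```python
def plan_openings(urls, delay_seconds=10, batch_size=250):
    """Yield (batch_number, seconds_into_batch, url) for each URL.

    URLs are grouped into batches of batch_size; within a batch each URL
    is scheduled delay_seconds after the previous one.
    """
    for index, url in enumerate(urls):
        batch, position = divmod(index, batch_size)
        yield batch, position * delay_seconds, url
```

With the defaults, the last URL of a full batch is scheduled 2,490 seconds after the first (249 gaps of 10 seconds), so one batch of 250 takes about 41.5 minutes. To actually open the pages from a script, each URL could be handed to the standard library's `webbrowser.open` with a `time.sleep(delay_seconds)` between calls.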

I won’t be needing to upload too many more: I’m nearly finished with this part of the process (I haven’t tallied how many I have left yet, but I know I’m nearly done with this part).  It’s mostly catching stragglers, mistakes I made in preparing the files that caused them to fail and getting the videos I’m missing and giving them proper markup.

Once this is complete, I’ll be working with my little library here, working with the files and the metadata and sorting it all out.

Sorry it’s not a step-by-step process, PDPolice, but *I’m* just figuring out the process as I go along as best I can.

Certain things I do the way I do them anyway: I’m an Excel guy and I’m used to working with 50 to 100,000 files at once. I’d work with millions, but neither my ancient laptop nor any of my software is powerful enough for that, ’cause at that level I’d have to go to: DATABASES. I hate working with database schemas. They suck. So limited. So fixed. I know things are better now: JSON is supposed to make the world a better place and all that… but I’m still of the CSV mindset and I like my rows + columns and the ability to see “everything at once” rather than hidden.

I’m looking forward to sparse distributed representations of concepts being easier and easier to manage on computers.

I work with what I can though. I need the flexibility because when your data defines your eventual schema, and uncertainty rules the whole process, all you can do is stick to basic stuff like spatiotemporal database representations and things like identifiers and keywords, and hope that the data itself will shape the final result.

Anyway, I’m babbling. I need coffee. Short of it is, I use whatever tools I have at hand to accomplish tasks. They might not be the best tools but they work. I had to create a several billion item MS Access 2000 database to crosslink a public domain thesaurus to make a crosslinked thesaurus whose purpose I *still* don’t know yet [but I did it anyway], amazon’d it, put it up here on IA too – my own pirate – and, yeah, maybe I’d better get that coffee.
