Any Perl programmer want to make a few bucks?



Mike Anderson
2006-02-03, 21:08
I know this is a strange request, as it has nothing to do with the Squeezebox per se, but I also know there are a lot of Perl programmers here. So….

Can someone out there write me a little Perl script? It'd be a dead simple program, and I can pay you for your time and trouble.

Basically all it has to do is look at the contents of a directory, and output a text file whose exact contents would depend on the names and numbers of files in that directory. Bonus points if it can do this for multiple subdirectories in a directory. I'll give you the specs for the file if you want to see it before taking on the job.

This would be done on a Windows PC (I'd be working from DOS I suppose.)

Anyone? This would probably take 10 minutes for a decent programmer to write, I suppose, but I'll pay $50, or whatever it's worth to you, depending on how hard it is to do.

Thanks,
Mike

Victor
2006-02-03, 22:37
Post more specifics please. This sounds like a 2 minute code dump :)

Mike Anderson
2006-02-03, 23:15
I posted this to jobs.perl.org by the way, as well as a Perl newsgroup.

You can assume the directory is called DirectoryName, and contains files of only two types: .txt files, and .tif files. The directory will contain files something like this:

FileName1.txt
FileName1.tif

FileName2.txt
FileName2.tif
FileName3.tif
FileName4.tif

FileName5.txt
FileName5.tif
FileName6.tif
FileName7.tif
FileName8.tif

There is always a .txt file associated with one or more .tif files.

For example, FileName1.txt is obviously associated with FileName1.tif.

But there are three .tif files associated with FileName2.txt: FileName2.tif, FileName3.tif, and FileName4.tif.

And there are four .tif files associated with FileName5.txt: FileName5.tif, FileName6.tif, FileName7.tif, and FileName8.tif

The name of a .txt file always corresponds exactly to the name of the first .tif file associated with it (excepting for the file extension, of course).

There can be any number of .tif files associated with a .txt file. You can assume the .tif files associated with a .txt file are always numbered in sequence, starting with the number of the .txt file.

However, you should not assume that the number of .tif files associated with a .txt file increases with the number of .txt files, as it happens to in my example. The first .txt file might have five .tif files with it, the second .txt file might have only one .tif file associated with it, and so on.
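
A minimal sketch of the grouping rule described above, written in Python rather than Perl (the Perl script from this thread is not included here). The function names `natural_key` and `group_records` are hypothetical. It assumes that sorting the names with the numeric parts compared as numbers reproduces the intended sequence, so that FileName10 comes after FileName9.

```python
import os
import re

def natural_key(name):
    """Sort key comparing digit runs as numbers (assumed ordering rule)."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group alternates text, digits, text, ...
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def group_records(filenames):
    """Return [(txt_stem, [tif_names])]: each .txt owns every following
    .tif up to, but not including, the .tif that matches the next .txt."""
    txt_stems = {os.path.splitext(f)[0] for f in filenames
                 if f.lower().endswith(".txt")}
    tifs = sorted((f for f in filenames if f.lower().endswith(".tif")),
                  key=natural_key)
    records = []
    for tif in tifs:
        stem = os.path.splitext(tif)[0]
        if stem in txt_stems:
            records.append((stem, [tif]))   # a new record starts here
        elif records:
            records[-1][1].append(tif)      # continuation page
        else:
            raise ValueError(f"{tif} has no preceding .txt file")
    return records

if __name__ == "__main__":
    names = ["FileName1.txt", "FileName1.tif",
             "FileName2.txt", "FileName2.tif", "FileName3.tif", "FileName4.tif"]
    for stem, pages in group_records(names):
        print(stem, pages)
```

Nothing in the sketch depends on how many .tif files each record has, which is the point of the caveat above.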

The outputted text file should have the following format for the above group of files:

@FULLTEXT DOC

; Record 1
@C BEGDOC# FileName1
@C ENDDOC# FileName1
@C PGCount 1
@T FileName1
@D @I\DirectoryName\
FileName1.tif

; Record 2
@C BEGDOC# FileName2
@C ENDDOC# FileName4
@C PGCount 3
@T FileName2
@D @I\DirectoryName\
FileName2.tif
FileName3.tif
FileName4.tif

; Record 3
@C BEGDOC# FileName5
@C ENDDOC# FileName8
@C PGCount 4
@T FileName5
@D @I\DirectoryName\
FileName5.tif
FileName6.tif
FileName7.tif
FileName8.tif

You can see that after BEGDOC#, you put the file name corresponding to the first .tif file that corresponds to a .txt file, and after ENDDOC#, you put the name of the last .tif file corresponding to that .txt file. After PGCount, you put the number of .tif files that correspond to that .txt file. After @T, you put the name of the .txt file (minus the extension). And then below the @D line, you list all the .tif files that correspond to that .txt file.
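
The field rules above could be turned into output along these lines. This is only a sketch in Python, not the Perl script from the thread; `render_dii` and its argument shapes (a directory name plus a list of `(txt_stem, [tif_names])` pairs) are assumptions made for illustration.

```python
def render_dii(dir_name, records):
    """Build the .dii text for one directory.

    records is a list of (txt_stem, [tif_names]) pairs, in order.
    """
    lines = ["@FULLTEXT DOC", ""]
    for number, (txt_stem, tifs) in enumerate(records, start=1):
        first = tifs[0].rsplit(".", 1)[0]    # first page, extension removed
        last = tifs[-1].rsplit(".", 1)[0]    # last page, extension removed
        lines += [
            f"; Record {number}",
            f"@C BEGDOC# {first}",
            f"@C ENDDOC# {last}",
            f"@C PGCount {len(tifs)}",
            f"@T {txt_stem}",
            f"@D @I\\{dir_name}\\",
            *tifs,
            "",                              # blank line between records
        ]
    return "\n".join(lines)

if __name__ == "__main__":
    example = [("FileName2", ["FileName2.tif", "FileName3.tif", "FileName4.tif"])]
    print(render_dii("DirectoryName", example))
```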

There can be any number of .txt files, sometimes quite large, and any (nonzero) number of .tif files associated with each .txt file.

The outputted text file should be called DirectoryName.dii where DirectoryName is the name of the directory holding the files, as above.

That's it.

I have a large number of directories to process in this manner, so if you want to build a script that automatically searches for subdirectories containing files like this, and creates a .dii file for each such subdirectory, that'd be even better.
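
For the subdirectory search, one possible approach is sketched below in Python (again, not the Perl script from the thread). The helper name `find_document_dirs` is hypothetical, and the rule "a directory qualifies if it holds at least one .txt and one .tif file" is an assumption about what "files like this" means.

```python
import os

def find_document_dirs(root):
    """Yield (directory_path, dii_file_name) for every directory under
    root that contains at least one .txt and one .tif file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        extensions = {os.path.splitext(f)[1].lower() for f in filenames}
        if {".txt", ".tif"} <= extensions:
            # Name the output after the directory itself, not its full path.
            leaf = os.path.basename(os.path.normpath(dirpath))
            yield dirpath, leaf + ".dii"

if __name__ == "__main__":
    for path, dii_name in find_document_dirs("."):
        print(path, "->", dii_name)
```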

Victor
2006-02-04, 00:05
ok that's a bit more than a 2 minute brain dump :)

I am up late tonight, the baby doesn't feel like sleeping, which means no one in the house sleeps :) Soon as I get a 10 minute break in the action I'll whip this out for you

Mike Anderson
2006-02-04, 00:17
^^^ Cool!

Then just PM me with your address, and I'll mail you a check.

Thx

Victor
2006-02-04, 01:42
Here ya go -- give this a shot. The script needs 2 arguments:
1) The first argument is the log file where everything gets written to
2) The second (and more) arguments are the directories to look in

So you run it like this:
tagger.pl logfile.txt /path/to/dir1 /path/to/dir2

Dir names can either be absolute or relative to your current directory

It only looks explicitly in the directories listed and not in any subdirectories. If you want it to do that let me know and it's a tiny tweak of the code.

BTW, I deliberately wrote this unlike my typical perl coding style (which tends to be a little...um...concise let's say). This should be much easier to read for the average person in case they want to modify it.

One more thing -- I don't have a Windows machine to test on. It *should* work ok, but let me know if there's a weird path problem I hadn't thought of.
Let me know how it works.

Victor
2006-02-04, 01:44
Ooops, just noticed I missed this line:
"The outputted text file should be called DirectoryName.dii where DirectoryName is the name of the directory holding the files, as above."

Sorry about that. Fixing it now...

I am going to assume you want all the *.dii files in the same dir after you're done. If that's not the case, let me know as well.

Victor
2006-02-04, 02:25
Ok try #2 :)

Now the code searches all the subdirectories of any directories given on the command line and it tries to be semi-smart about not getting itself into an infinite loop if a subdir is a link to a dir it's already seen, or if you name a dir twice by mistake.
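
The script itself isn't shown in the thread, so the following is only a guess at the kind of check described: a Python sketch that remembers the resolved path of every directory it visits and refuses to descend into one it has already seen. `walk_unique` is a hypothetical name.

```python
import os

def walk_unique(roots):
    """Yield each directory under the given roots once, even when a
    symlink points back at a visited directory or a root is repeated."""
    seen = set()
    for root in roots:
        for dirpath, dirnames, _filenames in os.walk(root, followlinks=True):
            real = os.path.realpath(dirpath)   # resolve links to a canonical path
            if real in seen:
                dirnames[:] = []               # prune so os.walk goes no deeper
                continue
            seen.add(real)
            yield dirpath

if __name__ == "__main__":
    for directory in walk_unique([".", "."]):  # the repeated root is skipped
        print(directory)
```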

It dumps all the .dii in the directory you call the script from and names them after the directories it looked in. I chose to have an underscore replace slashes in the name of the dir.

Hope this works for ya.

Victor

Mike Anderson
2006-02-04, 09:41
Thanks much, Victor. I checked it out, but there appears to be a bug or two.

1) I pointed it at a directory named C:\TestDir1:

File1.txt
File1.tif

File2.txt
File2.tif
File3.tif
File4.tif

File5.txt
File5.tif
File6.tif
File7.tif
File8.tif

And the output it gave me is:

@FULLTEXT DOC

;RECORD 1
@C BEGDOC# File1
@C ENDDOC# File8
@C PGCOUNT 8
@T File5
@D @I\C:/TestDir1
File1.tif
File2.tif
File3.tif
File4.tif
File5.tif
File6.tif
File7.tif
File8.tif


As described above, there should be three separate records outputted, not one.

Also, there shouldn't be a c:/ after the @I.

Finally, the output filename was something much longer and more complicated than TestDir1.dii (and the "dii" part was in the name, not the extension.)

I did mention I'm doing this on a Windows platform, yes?

Thanks

Victor
2006-02-04, 10:45
I am confused about your spec then, so a few questions:

1) The output file is named to the full path of the directory it searched, with slashes being replaced with underscores. Do you want just the dir name, not the full path? If so, how do you deal with the possibility of multiple dirs that are named the same?

2) I did misunderstand the spec about the separate records. In your example, did you mean to say that when a .txt file is found, it's associated with every .tif file found in that dir after that until we hit another .txt file?

Mike Anderson
2006-02-04, 10:49
1) The name of the output file should just be the name of the lowest subdirectory. E.g. if the full path is:

c:\Dir1\Subdir1\Subdir2

Then the output file should be named Subdir2.dii

Don't worry about subdirectories with the same name, it shouldn't be a problem.

2) My apologies for not having written clearer specs. But yes, what you've said above is exactly correct. And the output for the example I posted should be just as I specified above in post #3.

thanks,
Mike

Victor
2006-02-04, 10:58
So after the @I, do you put full path name or just the dir name? And if the dir is a subdir of another (i.e. dir1/dir2), what do you put?

Mike Anderson
2006-02-04, 11:03
After the @I, just put the subdirectory name. It should be the name of the subdirectory that contains the .txt and .tif files. (And you shouldn't have to worry about there being any subdirectories lower than that, or any identically named directories.)

thanks again,
Mike

Mike Anderson
2006-02-04, 11:38
Whoops, just to clarify: If the directory with the .txt and tif files is named SubDir1, then that whole line with the @I should show:

@D @I\SubDir1\

In the example quoted, if the dir is a subdir of another (i.e. dir1/dir2) then it should be:

@D @I\Dir2\
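
Both naming rules (the .dii file name and the @D line) come down to taking the last component of the path. A small Python sketch, assuming Windows-style paths and using the standard `ntpath` module so that both `\` and `/` separators are handled; `dii_names` is a hypothetical helper, not part of the script discussed here.

```python
import ntpath

def dii_names(directory):
    """Return (output_file_name, d_line) for a directory path."""
    # normpath drops any trailing separator; basename keeps only the leaf.
    leaf = ntpath.basename(ntpath.normpath(directory))
    return leaf + ".dii", "@D @I\\" + leaf + "\\"

if __name__ == "__main__":
    print(dii_names(r"c:\Dir1\Subdir1\Subdir2"))
    print(dii_names("C:/TestDir1"))
```

With this, a path such as C:/TestDir1 gives TestDir1.dii and no drive letter after the @I.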

thanks,
Mike

MrC
2006-02-04, 12:22
How about taking this discussion offline since it has no relation to Slim nor music in general.

Mike Anderson
2006-02-04, 12:28
Right -- Victor, you can email at mikeand1@comcast.net

Thanks,
Mike

Victor
2006-02-05, 02:17
Sorry for the off-topic spam :)

If a mod wants to nuke this thread, feel free to do so