Is there a program to search for and replace content in multiple HTM files?

Started by fesworks, September 13, 2008, 04:24:49 PM

fesworks

Xepher, I think you told me about it before, but I don't remember.

I have a couple hundred pages I'd like to change just little things in. They have the same coding aside from specific images (comics) and the navigation links.

www.PSIwebcomic.com
www.TheShifterArchive.com
www.ArdraComic.com
www.WebcomicBeacon.com

Miluette

You can do that in Dreamweaver. I used it for stuff I needed to change in my few hundred archive pages.

If there's something else it sure would be handier.
And wasn't it you who told me,
"The sun would always chase the day"?

fesworks

is it... easy to find and figure out how to do that? I've opened Dreamweaver like once, then I came to after being comatose for an indeterminable amount of time, and in a pool of my own drool... then I closed it....

griever

I've attached screenshots to help walk you through it. ^_^  I have a Mac and Dreamweaver 8 so the layout might be a little different but the process should be the same.

FIRST: MAKE NEW FOLDER AND MAKE COPIES OF ALL THE PAGES YOU'RE GOING TO WORK ON!!!  I only say this because it's the first time and there's no "Undo" if you make a mistake.  Which I learned the hard way.  *cough*

1:  Edit > Find and Replace.  Easy!

Here's where things get a little detail-oriented.

2: Find the "Find in" menu.  Choose any of the options listed.  Because you said you have 200+ documents to get through, I chose "Folder..."  However, if you open multiple documents (say, do them in batches of 10), you would choose "Open Documents".

3: If you choose "Folder...", you may or may not receive a prompt to find the folder you want to look in.  If not, it should be nearby - mine says "Phoexix: Users: MK: Sites: sendmeonmyway"  USE THE COPIES YOU MADE EARLIER (if you choose this method).

4: Go to the "Search" menu.  Now you have a bunch of choices.  I chose "Source Code", which means that it will only go through your HTML.

5: Then, in the "Find" box, type what you want to find and in the "Replace" box, type what you want it to be replaced with.  Full HTML tags can be used here.

6: Click "Replace All".  You might get a warning, like "Are you sure you want to do this you crazy foolish person you??", you might not.  But that's why we have the test folder!

7: As the process runs, a box should come up with a report of all the things it's replaced.  You can double check it, if you want, but as long as the code is the same (like in a template), you should be okay.

http://img229.imageshack.us/my.php?image=picture8ch3.png (Step 1)
http://img246.imageshack.us/my.php?image=picture9uk7.png (Steps 2-3)
http://img99.imageshack.us/my.php?image=picture11cy3.png (Steps 4-6)
"You can get all A's and still flunk life." (Walker Percy)

Xepher

Well, if dreamweaver made you go comatose, then my suggestion is gonna really break your brain...

find /home/fesworks/COPY_html/ -iname "*.php" -exec sed -i 's/ugly/beautiful/g' {} \;

Where the path is the folder you want to start with, "*.php" is the type of files, "ugly" is the old string and "beautiful" is the new one. That is to say, the above example will find all php files in /home/fesworks/COPY_html and replace all occurrences of the four-letter string "ugly" with "beautiful". Be careful if you have stuff involving slashes to be replaced though, and as noted above, test and try on COPIES, not the real files... at least until you have the command right. After that, it can be quite handy to just edit them all in place.
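
Here is a minimal sketch of that command run against a throwaway folder, so nothing real gets touched. The demo folder and file names are made up for illustration, and it assumes GNU find and sed, as found on most Linux servers. Note the backslash before the final semicolon: without it the shell swallows the semicolon and find complains that -exec is missing an argument.

```shell
#!/bin/sh
# Throwaway folder with two sample pages (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/archive"
printf 'one ugly page\n' > "$demo/page1.php"
printf 'ugly links, ugly images\n' > "$demo/archive/page2.php"

# Walk the folder and its subfolders; -exec runs sed once per matching
# file and must be terminated by an escaped semicolon.
find "$demo" -iname "*.php" -exec sed -i 's/ugly/beautiful/g' {} \;

# Show the result.
cat "$demo/page1.php" "$demo/archive/page2.php"
```

Once the output looks right on the copies, the same line can be pointed at the real folder.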


fesworks

Quote from: Xepher on September 15, 2008, 02:59:48 AM
Well, if dreamweaver made you go comatose, then my suggestion is gonna really break your brain...

find /home/fesworks/COPY_html/ -iname "*.php" -exec sed -i 's/ugly/beautiful/g' {} \;

Where the path is the folder you want to start with, "*.php" is the type of files, "ugly" is the old string and "beautiful" is the new one. That is to say, the above example will find all php files in /home/fesworks/COPY_html and replace all occurrences of the four-letter string "ugly" with "beautiful". Be careful if you have stuff involving slashes to be replaced though, and as noted above, test and try on COPIES, not the real files... at least until you have the command right. After that, it can be quite handy to just edit them all in place.



Is this a BATCH command? Where would I run this code? Looks like you intend it to run on the server.

I'm guessing for slashes and "<" and the like, I'll have to use HTML code such as "%20" for a " " (space), etc... right? (I only know that %20 is a space, but I know I can easily find a list online somewhere)

Xepher

Yeah, it's code to run in linux... aka "on the server" if you aren't running Ubuntu or something at home. It's really just an example to show what you can do if you want to look into it. If you're in SSH on the server, type "man sed" to get the manual page for sed, and it should kind of explain what to do for special characters. Usually it's a matter of backslash escapes. Eg. "Fes & Ernst" becomes "Fes \& Ernst"

It's kinda complicated to learn, but really powerful when you do. Google, as always, is your friend.
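
To make the escaping concrete, here is a small sketch. The sample text and file name are made up, and it assumes GNU sed. In the replacement half of the expression, a bare & means "whatever was matched", so a literal ampersand needs a backslash. For text full of slashes, sed accepts another delimiter such as | after the s, so the slashes need no escaping at all.

```shell
#!/bin/sh
# Throwaway file to experiment on (hypothetical name and content).
demo=$(mktemp -d)
printf 'Fes and Ernst live at /old/path\n' > "$demo/sample.htm"

# A literal & in the replacement has to be written as \&
sed -i 's/Fes and Ernst/Fes \& Ernst/' "$demo/sample.htm"

# With | as the delimiter, the slashes in the paths need no backslashes.
sed -i 's|/old/path|/new/path|' "$demo/sample.htm"

cat "$demo/sample.htm"
# prints: Fes & Ernst live at /new/path
```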

Databits

It's as easy as pie.

perl -p -i -e 's/target/replacement/g' $(find -name '*.htm')

The difference being that this just modifies the files directly, rather than making a copy. You should probably look into "regular expressions" as well, since you'll be using them in any sort of mass replacement on Linux (even in Xepher's version).
(\_/)    ~Relakuyae D'Selemae
(o.O)    
(")_(")  [Libre Office] [Chrome]

Xepher

Actually the one I listed modifies the files directly too. That's why I said he should make his own copy first. That's the "-i" argument to sed, it's for "in-place" modification. But yeah, the search/replace is done with basic regular expression syntax either way.
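
If making copies by hand feels error-prone, sed can also keep one for you: a suffix attached directly to -i tells it to save the original under that extension before editing. A small sketch, with a made-up file name:

```shell
#!/bin/sh
# Throwaway file to experiment on (hypothetical name and content).
demo=$(mktemp -d)
printf 'ugly page\n' > "$demo/comic.htm"

# -i.bak edits comic.htm in place and keeps the original as comic.htm.bak
sed -i.bak 's/ugly/beautiful/g' "$demo/comic.htm"

cat "$demo/comic.htm"      # the edited file
cat "$demo/comic.htm.bak"  # the untouched original
```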

fesworks

Quote from: Databits on September 15, 2008, 05:46:50 PM
It's as easy as pie.

perl -p -i -e 's/target/replacement/g' $(find -name '*.htm')

The difference being that this just modifies the files directly, rather than making a copy. You should probably look into "regular expressions" as well, since you'll be using them in any sort of mass replacement on Linux (even in Xepher's version).

Where do I edit this to call the directory the files are in?

Also, I've been working with Xepher's... doesn't seem to work. I assume I open up the "console" on WinSCP, and enter it in. Well, since the files are HTM, I adjusted accordingly.

Also, I want to replace <a href="index.php">
to
<a href="http://fesandernst.com">

But even with a bunch of backslashes before every non-letter character, it didn't work. So I decided just to change the "php" to "html", so I could do the same thing and change them back to "php" later, when I change the index page back... so:

find /home/fesworks/public_html/PSI/crappy/ -iname "*.htm" -exec sed -i 's/php/html/g' {} ;

Didn't even work. Ran into a syntax error and quit. :/


Also, I guess I was hoping that I could have done something that could find and replace an entire block of code in multiple files, but it does not look like this code is capable of doing that.

I'm gonna try the Dreamweaver thing next.
