Create a mirror of a WebsiteBaker site

noname8

I decided to go with short.php because:
- all the links and the site structure in WB contain .php URLs, so just mirroring them to .html cannot work
- WB also uses FULL (not relative) paths for every file on the site, and these need to be rewritten
- I prefer URLs like /mything over /pages/mything.php, and maybe Google likes them better too

So dev4me's .htaccess + short.php already covered 70% of what was required. I tried the HTTrack website copier and that just didn't work: after clicking through a few links, the copy breaks with broken links. It seems it cannot rebuild a WB site, at least not with .php as the original URLs.

Building on the dev4me solution, I'm trying to implement a cache. I have already written a curl call that fetches pages from the main server and rewrites the URLs to the mirror domain and path; now it just needs caching and a few... a lot more tweaks.
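
For anyone following along, the URL rewriting step could look roughly like the sketch below. It is not the actual script. The host names, the function name rewriteForMirror and the assumption that every page lives under /pages/ and ends in .php are all made up for illustration.

```php
<?php
// Hypothetical hosts, used for illustration only.
const MAIN_BASE   = 'https://main.example.com';
const MIRROR_BASE = 'https://mirror.example.org';

/**
 * Rewrite WB URLs in fetched HTML so they point at the mirror.
 * Assumption: pages live under /pages/ and end in .php.
 */
function rewriteForMirror(string $html, string $mainBase, string $mirrorBase): string
{
    // 1. Full page URLs: <main>/pages/foo/bar.php becomes <mirror>/foo/bar
    $pattern = '~' . preg_quote($mainBase, '~') . '/pages/([^"\'\s?#]+)\.php~';
    $html = preg_replace($pattern, $mirrorBase . '/$1', $html);

    // 2. Everything else on the main host (media, templates): swap the host only.
    return str_replace($mainBase, $mirrorBase, $html);
}
```

Page links are handled first so that the /pages/ prefix and the .php suffix are dropped in one pass. The plain str_replace afterwards only catches what is left, such as media and template files.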

CodeALot

 :-o :-o

Short.php is in no way intended to create some kind of "mirror-without-a-database" from a WB website. It is a script that eliminates /pages/ from the URLs.
So how is it that you "decided to go with short.php"?

noname8

Thank you for your replies, I decided to go with http://short.dev4me.nl/

On the mirror server I decided to have PHP support after all (but no SQL), and to run the same script with modifications: it makes a curl request to the main site, rewrites all the URLs with a string replace, and then caches the file. It works about 80% now; still a work in progress.
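
A minimal sketch of the fetch-and-cache part, assuming a writable cache directory on the mirror. The function names fetchWithCurl and cachedPage, the md5-based file names and the max-age parameter are assumptions for illustration, not taken from short.php.

```php
<?php
/**
 * Fetch a URL with cURL. Returns the body, or null on failure.
 */
function fetchWithCurl(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects from the main site
        CURLOPT_TIMEOUT        => 10,
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return ($body !== false && $status === 200) ? $body : null;
}

/**
 * Return a page from the file cache, fetching it on a miss.
 * $fetch is any callable that takes a URL and returns ?string.
 */
function cachedPage(string $url, string $cacheDir, int $maxAge, callable $fetch): ?string
{
    $file = $cacheDir . '/' . md5($url) . '.html';
    if (is_file($file) && time() - filemtime($file) < $maxAge) {
        return file_get_contents($file); // fresh cache hit
    }
    $html = $fetch($url);
    if ($html === null) {
        // Main site unreachable: fall back to a stale copy if one exists.
        return is_file($file) ? file_get_contents($file) : null;
    }
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}
```

Usage would be something like cachedPage($url, __DIR__ . '/cache', 3600, 'fetchWithCurl'). Any URL rewriting would be applied to the fetched HTML before it is written. Passing the fetcher as a callable keeps the cache logic testable without a network.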

evaki

@ the helpers
Please ignore the suggestions I posted.
Background:
They came out of my "rummage box". What comes from there is often useful, but in this case it belongs in the bin. I don't look at every piece that gets passed on to me, but this time I did, because it seemed suspiciously "simple".

You can take it as an idea; anything more only if you tinker with it a bit.
I tested it this afternoon. As an idea it is actually interesting. If, for example, you rewrite $cacheFile as
$cacheFile=$_SERVER['DOCUMENT_ROOT']."/htmlout/".$row['link'].".html";
that is, give each page the corresponding file name (from the DB, $row['link']), it works well for files in wb_root. For submenus it does not work "yet"; the directories would have to exist beforehand. Once they do, that works too. Anyone with time for this can extend/adapt it.
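
The missing directories for submenu pages could be created on the fly. A rough sketch, assuming $row['link'] holds a path such as /products/widgets; the helper name cacheFileFor is made up for illustration:

```php
<?php
/**
 * Map a page link (e.g. "/products/widgets") to a file under $outDir
 * and create any missing parent directories first.
 */
function cacheFileFor(string $outDir, string $link): string
{
    // Drop empty, "." and ".." segments so the path cannot leave $outDir.
    $parts = array_filter(explode('/', $link), function ($p) {
        return $p !== '' && $p !== '.' && $p !== '..';
    });
    $file = rtrim($outDir, '/') . '/' . implode('/', $parts) . '.html';

    $dir = dirname($file);
    // The third argument of mkdir() creates nested directories in one call.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create cache directory: $dir");
    }
    return $file;
}
```

It would then be called as $cacheFile = cacheFileFor($_SERVER['DOCUMENT_ROOT'] . '/htmlout', $row['link']);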

This would only be one part of the task anyway, and it is questionable whether it would be satisfactory in terms of the topic. There would be a long tail of follow-up work attached. That can be solved, but an alternative approach might make more sense.

I would rather recommend a crawler script (the kind you probably only know from the early days).

Apologies again for uploading this untested.

Regards, Evaki

evaki

Alternative:
https://forum.WebsiteBaker.org/index.php/topic,14663.msg92478.html#msg92478

evaki

The other tasks can be solved with PHP

evaki

#6
Cache your pages and crawl once.
To renew a page:
After you have updated a page, delete the changed (HTML) page, and crawl again  :-D

Your Template:
<?php
//$cacheFile = $_SERVER['DOCUMENT_ROOT'] . "/wb/pages/" . constant('MENU_TITLE') . ".html";
$cacheFile = $_SERVER['DOCUMENT_ROOT'] . "/htmlout/" . constant('MENU_TITLE') . ".html";
if (file_exists($cacheFile)) {
    // cache hit: send the stored file back and reduce database load
    header("Content-Type: text/html");
    readfile($cacheFile);
    exit;
} else {
    ob_start(); // start buffering so we can cache for future accesses
}
?>


<html>
<body>
<content>Hello, World!</content>
</body>
</html>


<?php

// get the buffer
$buffer = ob_get_contents();

// end output buffering; the buffer content
// is sent to the client
ob_end_flush();

// now we create the cache file
$fp = fopen($cacheFile, "w");
fwrite($fp, $buffer);
fclose($fp);
?>

Regards, Evaki

Martin Hecht

maybe something like wget -m -p -D yourdomain yourstartpage

noname8

#4
Thank you for the replies.

The mirror I would like to create should be .htm only, without .php and MySQL. So a plain clone does not work.
I have used HTTrack, but I'm currently on a Mac, so I would like to have a bash script or a PHP curl job.
[edit: HTTrack seems to work on Mac too, but I would still prefer a quick script that I can run on the server]

ruebenwurzel


hgs

Did I get that right?
You want to create a backup of your WebsiteBaker live page via script?

My hoster offers this for the webspace and the database. I have it done automatically once a week by a cron job, so my "mirror" is created automatically. A second cron job then automatically deletes all data older than 15 weeks.
Best regards, Harald

"Never begin to stop - never stop beginning." Marcus Tullius Cicero (106-43 BC)

noname8

I would like to create a mirror of my WebsiteBaker site.
I would host this mirror on a "serverless" server: no PHP support, no MySQL.

To do this, I'm thinking of some kind of PHP curl crawler script that I'd run every time I update content on the main WB site via the admin, about once a month. The mirror would need a different root URL but the same page structure, with .php converted to .html, and the whole media directory copied as well.

Has anybody done this? I don't know where to start...

Thanks for your help