Eric van der Vlist
|
160a4e7326
|
Log only when WP_DEBUG == true
|
2 years ago |
Eric van der Vlist
|
5ee8aba026
|
Header
|
2 years ago |
Eric van der Vlist
|
a2f20cb296
|
Trying netbeans...
|
2 years ago |
Eric van der Vlist
|
b7c70cfd10
|
Reformatting
|
2 years ago |
Eric van der Vlist
|
0d4af6419a
|
cleanup
|
2 years ago |
Eric van der Vlist
|
ba2cc3ec40
|
Updating these views and adding a third one to list the archives with their status.
|
2 years ago |
Eric van der Vlist
|
4ef5fe2994
|
Using these views
|
2 years ago |
Eric van der Vlist
|
0097514442
|
Adding views
|
2 years ago |
Eric van der Vlist
|
7d2b6e53b3
|
Indentation
|
2 years ago |
Eric van der Vlist
|
49afb2b9ea
|
Massive refactoring and bug fixes
|
2 years ago |
Eric van der Vlist
|
460c77f116
|
Check that the archive directory is writable
|
2 years ago |
Eric van der Vlist
|
4d62124a03
|
PHP7 constructor syntax
|
2 years ago |
Eric van der Vlist
|
885867e065
|
Removing "&" in function calls by reference (support of PHP 5.6+)
|
2 years ago |
Eric van der Vlist
|
21807536ca
|
Deleting what doesn't belong to WordPress
|
2 years ago |
Eric van der Vlist
|
10c0d87b93
|
Markdown
|
2 years ago |
Eric van der Vlist
|
b9c833fd17
|
Removing intermediary directories
|
2 years ago |
Eric van der Vlist
|
f907af85c7
|
Fixing #9
|
9 years ago |
Eric van der Vlist
|
5acb10101f
|
Rewriting resources with no archived out links
|
11 years ago |
Eric van der Vlist
|
4473ad6e15
|
Support HTML @background
|
11 years ago |
Eric van der Vlist
|
94d335170f
|
Map application/xhtml+xml to .html
|
11 years ago |
Eric van der Vlist
|
5e2b674092
|
Store the crawl log into the archive
|
11 years ago |
Eric van der Vlist
|
c25b18f9f5
|
Support HTML embed/@src
|
11 years ago |
Eric van der Vlist
|
16ef7979b0
|
Trying to guess content types
|
11 years ago |
Eric van der Vlist
|
bc581fabf9
|
Adapting relative links to match the structure of the browsable archive
|
11 years ago |
Eric van der Vlist
|
bf2980567a
|
Cleaning the algorithm to compute friendly local names.
|
11 years ago |
Eric van der Vlist
|
cfaf8ae9c2
|
Adding XSLTUnit tests for the local-name function.
|
11 years ago |
Eric van der Vlist
|
a7c3525ef6
|
Hmmm... HTML should be serialized as HTML, of course!
|
11 years ago |
Eric van der Vlist
|
c79bd8e49c
|
Forcing HTML content type for XHTML documents
|
11 years ago |
Eric van der Vlist
|
9bce34f7c6
|
Rewriting links in HTML and CSS resources within WARC archives
|
11 years ago |
Eric van der Vlist
|
5b162a64df
|
WARC mail extract loop
|
11 years ago |
Eric van der Vlist
|
466d4473ce
|
Generating a resource index to facilitate further processing.
|
11 years ago |
Eric van der Vlist
|
675ed04aba
|
Download and convert the crawl log
|
11 years ago |
Eric van der Vlist
|
6f64c7f8a9
|
Handling payload content types
|
11 years ago |
Eric van der Vlist
|
be1a361ab9
|
Implementing yet another WARC parser (the heritrix one didn't work well with Orbeon due to http client library conflicts).
|
11 years ago |
Eric van der Vlist
|
307b6d2a72
|
Adding whois records
|
11 years ago |
Eric van der Vlist
|
22c3028c38
|
First stab at WARC packaging.
|
11 years ago |
Eric van der Vlist
|
51c2058aa6
|
Queue an action to package the Heritrix WARC.
|
11 years ago |
Eric van der Vlist
|
b346236789
|
Adding a mechanism to delay actions in the queue.
|
11 years ago |
Eric van der Vlist
|
3bcb813cb7
|
Unpause Heritrix job.
|
11 years ago |
Eric van der Vlist
|
f25a9246bc
|
Modifying the way the Heritrix (Spring) config file is generated since it seems to be picky about whitespace and indentation...
|
11 years ago |
Eric van der Vlist
|
a3fa073667
|
Update to follow changes to Orbeon Forms experimental features...
|
11 years ago |
Eric van der Vlist
|
a1dc635607
|
Update to follow changes to Orbeon Forms experimental features...
|
11 years ago |
Eric van der Vlist
|
57daa703da
|
Now building and launching Heritrix jobs...
|
11 years ago |
Eric van der Vlist
|
be2f974a4c
|
Update to follow changes to Orbeon Forms experimental features...
|
11 years ago |
Eric van der Vlist
|
c4c4108025
|
Starting to write pipeline actions that interact with a Heritrix server
|
11 years ago |
Eric van der Vlist
|
ad35672603
|
Still work in progress, but the WARC archive now validates with warc-tools' warcvalid.py...
|
11 years ago |
Eric van der Vlist
|
ba51ddfb0b
|
Starting to support content lengths in WARC archives
|
11 years ago |
Eric van der Vlist
|
9d99928c60
|
Removing the last action from the queue
|
11 years ago |
Eric van der Vlist
|
01a66903f3
|
First version that can produce a packaged archive.
|
11 years ago |
Eric van der Vlist
|
5ac9ea90bb
|
Packaging resources that have not been rewritten...
|
11 years ago |