Integrated file upload is provided by some extant wikis (such as TWiki). Others allow it only in a halfhearted manner: a separate script uploads files to the server, and they can then be referenced in a wiki page as a simple hyperlink (e.g., http://wiki.somewhere.com/uploads/myfile.pdf).

One potential use for wikis is as groupware (think LotusNotes): communal workspaces or databases in which shared documents may reside. Simple groupware applications are already possible with a wiki, in which teams maintain a shared set of wiki documents. But for full-fledged groupware, we need the capability to upload files to the wiki.

Goals

Idea

For sites that have enabled uploads, display an additional link at the bottom of each page: Attach File. This link allows one to attach one or more files to a page. The script will store these files in the database.

Each page that has attachments will have a floating <div> at the top-right of the page body. This will contain an icon and the name for each attachment, as well as a link to attach additional files.

Clicking on the attachment icon or name will display the attachment management page for that attachment. It will show links for each version being archived. Users may download any of the versions. Users may also upload a new version of the document to be added to the top of the version list. Lastly, the document may be deleted. This places a <deleted> token at the top of the version list. If it remains uncontested for ExpirationPeriod days, the attachment will be removed. To contest a deletion, one need simply upload a new version of the attachment.
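
The deletion rule above could be modelled roughly as follows. This is only a sketch of the idea, not an implementation for 'Tavi: the class names are invented for illustration, and the 14-day value is an assumption, since the proposal does not say what ExpirationPeriod should be.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed value; the proposal leaves ExpirationPeriod unspecified.
EXPIRATION_PERIOD = timedelta(days=14)


@dataclass
class Version:
    content: Optional[bytes]  # None stands for the <deleted> token
    created: datetime

    @property
    def is_deletion(self) -> bool:
        return self.content is None


@dataclass
class Attachment:
    name: str
    versions: List[Version] = field(default_factory=list)  # newest first

    def upload(self, content: bytes, now: datetime) -> None:
        """Add a new version at the top; this also contests a pending deletion."""
        self.versions.insert(0, Version(content, now))

    def delete(self, now: datetime) -> None:
        """Put a <deleted> token at the top without discarding older versions."""
        self.versions.insert(0, Version(None, now))

    def should_purge(self, now: datetime) -> bool:
        """True once a deletion has stood uncontested for EXPIRATION_PERIOD."""
        if not self.versions or not self.versions[0].is_deletion:
            return False
        return now - self.versions[0].created >= EXPIRATION_PERIOD
```

Because a contesting upload simply becomes the newest version, no separate "undelete" action is needed; a periodic job would call should_purge and remove the attachments for which it returns True.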

The attachments will be stored in the page table. This will require schema changes. I'm not sure at this point how to distinguish a document as an attachment or as a wiki page. Perhaps we can add a mime-type field. Neither am I sure how to distinguish a document as being attached to a particular page. It could be as simple as naming it along the lines of PageName/attachment.pdf. Or we could do something more involved. :-) (My preference, I must warn you, is for simplicity. :-) -- ScottMoonen
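
To make the simple naming idea concrete, here is a small sketch of how a PageName/attachment.pdf name might be split and given a mime type. The function names and the "text/x-wiki" type for ordinary pages are assumptions made up for the example; only mimetypes.guess_type is a real standard-library call.

```python
import mimetypes
from typing import Optional, Tuple


def split_attachment_name(document_name: str) -> Tuple[str, Optional[str]]:
    """Split 'PageName/attachment.pdf' into (page, attachment).

    A name without a slash is treated as an ordinary wiki page.
    """
    page, sep, attachment = document_name.partition("/")
    if not sep or not attachment:
        return document_name, None
    return page, attachment


def guess_mime_type(document_name: str) -> str:
    """Guess a value for the proposed mime-type field."""
    _page, attachment = split_attachment_name(document_name)
    if attachment is None:
        return "text/x-wiki"  # hypothetical type for ordinary wiki pages
    guessed, _encoding = mimetypes.guess_type(attachment)
    return guessed or "application/octet-stream"
```

With this convention the page name alone says which page an attachment belongs to, at the cost of reserving the slash in page names.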

Questions

How do the various open-source databases perform under stress? I.e., can we assume that allowing folks to insert 20MB files into the database will not degrade performance? Obviously, in addition to setting web server access controls, we will recommend that administrators set a reasonable but restrictive maximum file size for uploads (which we'll provide as a configuration option).
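
As an illustration of the maximum-size option, the check could be enforced while the upload is being read, so that an oversized file is rejected before it ever reaches the database. This is a sketch under assumptions: the 2 MB default, the setting name, and the exception are all hypothetical.

```python
import io
from typing import BinaryIO

MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # hypothetical default for the config option


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured maximum size."""


def read_upload(stream: BinaryIO, limit: int = MAX_UPLOAD_BYTES,
                chunk_size: int = 64 * 1024) -> bytes:
    """Read an upload in chunks, giving up as soon as it passes the limit."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            raise UploadTooLarge(f"upload exceeds {limit} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

For example, read_upload(io.BytesIO(data), limit=1024) returns the data unchanged when it is at most 1024 bytes long and raises UploadTooLarge otherwise.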

Does the model presented above make sense to everyone? Is there a better way to achieve this?

How do we show attachment actions in RecentChanges? We need to do something to facilitate peer review of changes made to document attachments!

Comments

Fire away. What do you like or dislike about this idea?


Great initiative and an obvious extension to 'Tavi. I think one use of attachments is, or could be, offline editing, i.e., being able to edit a Wiki page in a desktop editor and then upload the result rather than use a textarea within a browser; for this kind of usage 'Tavi is mostly there. Another usage is for large multimedia files, which could run to hundreds of MB for MPEGs, so the idea of pushing them into an SQL database and also diff'ing them for changes is not so desirable. Perhaps there could be 3 paths for uploads, each with different policy settings. I know you want to keep things simple (totally concur) but there might be a need to evolve this way in any case. The option of just linking to files on disk and (separately) not diff'ing them could be useful. -- MarkConstable


I second the "3 paths" proposal. The "everything stored in the database" model is (in my experience anyhow) the "feature" most detested by system administrators when dealing with products such as Outlook, Groupwise and Notes. I think it lends itself well to lots of small documents but not to lots of (potentially) large documents. Storing large files in MySQL would create significant concerns about:

Again, this is based mainly on my (negative) experiences with Exchange server and Groupwise... maybe Tavi can overcome them :^)

Agreed that an option to store files on disk instead of in the db would be good. To avoid making it too complicated, it could simply be a matter of specifying an "incoming" directory in config.php.

Considerations:

  1. Can incoming files overwrite existing files of the same name?
  2. Can files be linked to while in the incoming directory, or only after the administrator has moved them somewhere else?

Hmm, maybe it's not so simple after all...
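
One possible answer to the first consideration is never to overwrite: if the name is taken, store the new file under a numbered variant. The sketch below assumes this policy, and the function name and numbering scheme are invented for illustration; it does not address the second consideration.

```python
import os
from pathlib import Path


def store_incoming(incoming_dir: Path, filename: str, data: bytes) -> Path:
    """Write data into incoming_dir without overwriting an existing file.

    If 'myfile.pdf' already exists, the upload is stored as 'myfile.1.pdf',
    then 'myfile.2.pdf', and so on.
    """
    # Drop any directory parts supplied by the client (for example '../').
    safe_name = os.path.basename(filename)
    if safe_name in ("", ".", ".."):
        raise ValueError("upload has no usable file name")
    stem, ext = os.path.splitext(safe_name)
    counter = 0
    while True:
        candidate = safe_name if counter == 0 else f"{stem}.{counter}{ext}"
        target = incoming_dir / candidate
        try:
            # Mode 'x' fails if the file exists, so two uploads cannot clobber each other.
            with open(target, "xb") as handle:
                handle.write(data)
            return target
        except FileExistsError:
            counter += 1
```

Opening with exclusive-create mode avoids the gap between checking for a file and writing it, which matters when several people upload at once.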

I most definitely want to do versioning of attachments. That's very important for groupware settings. And versioning is lots easier with a database.
Maybe, just maybe, we can throw together a version of the pagestore object that uses files instead of a database. Would that be satisfactory? -- ScottMoonen

As to "considerations" 1 and 2 above - [RCS] seems to handle that OK. I have tested one implementation of the RCS solution using [FaqWiz] - I don't know if either of those are useful reference points for generating any ideas for you in the Tavi context though... but thought I'd mention them anyhow :-)

The pagestore file storage option sounds like it could be a good compromise, if it is possible.


Regarding linking of attachments to pages, I would recommend separate database tables that track not only attachment versioning but also page-attachment relationships. Attachments could be stored either in the database or in a script-specified directory, with their location and filename recorded in the table entry.
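
A rough sketch of what such tables might look like follows. The table and column names are hypothetical, and SQLite is used here only so the example is self-contained; the discussion on this page assumes MySQL.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: one row per attachment, one per version, one per page link.
SCHEMA = """
CREATE TABLE attachment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE attachment_version (
    attachment_id INTEGER NOT NULL REFERENCES attachment(id),
    version       INTEGER NOT NULL,
    content       BLOB,  -- set when the file is stored in the database
    file_path     TEXT,  -- set when the file is stored on disk
    PRIMARY KEY (attachment_id, version),
    CHECK ((content IS NULL) <> (file_path IS NULL))
);
CREATE TABLE page_attachment (
    page_name     TEXT NOT NULL,
    attachment_id INTEGER NOT NULL REFERENCES attachment(id),
    PRIMARY KEY (page_name, attachment_id)
);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def attach(conn: sqlite3.Connection, page_name: str, name: str, file_path: str) -> int:
    """Record a new on-disk attachment (version 1) and link it to a page."""
    cur = conn.execute("INSERT INTO attachment (name) VALUES (?)", (name,))
    attachment_id = cur.lastrowid
    conn.execute(
        "INSERT INTO attachment_version (attachment_id, version, file_path) VALUES (?, 1, ?)",
        (attachment_id, file_path),
    )
    conn.execute(
        "INSERT INTO page_attachment (page_name, attachment_id) VALUES (?, ?)",
        (page_name, attachment_id),
    )
    return attachment_id


def attachments_for_page(conn: sqlite3.Connection, page_name: str) -> List[Tuple[str, int]]:
    """Return (attachment name, latest version number) for one page."""
    rows = conn.execute(
        "SELECT a.name, MAX(v.version) FROM page_attachment p "
        "JOIN attachment a ON a.id = p.attachment_id "
        "JOIN attachment_version v ON v.attachment_id = a.id "
        "WHERE p.page_name = ? GROUP BY a.id ORDER BY a.name",
        (page_name,),
    )
    return rows.fetchall()
```

Keeping the page link in its own table means one attachment can be shared by several pages, and the CHECK constraint records that each version lives either in the database or on disk, never both.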

Finally, it would be a convenient feature to have ftp access for bulk file uploads, with an automated script which would read the contents of a specified directory and attach them to a particular page. Some filenaming convention (such as FileNameDateVersion.ext) would have to be followed. -- Joshua McKenty


I think that once images can be uploaded, by whatever means, any user should be able to include any image in a wiki page simply. For me, the solution is to create something like the InterWiki procedure, named, for instance, ImageWiki. The syntax for including an image in a page would then be

ImageWiki:NewImage
In fact, why not reserve the ImageWiki page itself for a list of all the available images? Uploads would be driven from the ImageWiki page. It would be backed by a special table in the MySQL database recording, for each image, its name, path, creation date, last revision, and the pages that link to it. -- LaurentJacques
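
To illustrate how the ImageWiki:NewImage syntax might be recognised, here is a minimal sketch. The pattern for image names, the lookup dictionary (standing in for the proposed image table), and the generated markup are all assumptions; 'Tavi's actual parser is not shown on this page.

```python
import html
import re
from typing import Dict, List

# Assumed rule: an image name is a run of letters and digits after 'ImageWiki:'.
IMAGE_REF = re.compile(r"\bImageWiki:([A-Za-z0-9]+)")


def find_image_refs(text: str) -> List[str]:
    """Return the image names referenced in a page, in order of appearance."""
    return IMAGE_REF.findall(text)


def render_images(text: str, image_paths: Dict[str, str]) -> str:
    """Replace known ImageWiki references with <img> tags; leave unknown ones as text."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        path = image_paths.get(name)
        if path is None:
            return match.group(0)
        return f'<img src="{html.escape(path, quote=True)}" alt="{name}">'
    return IMAGE_REF.sub(replace, text)
```

The list returned by find_image_refs is also what would be needed to keep the "linked pages" column of the image table up to date whenever a page is saved.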

There is a page about alternative upload methods that I just stumbled across - UploadScripts. -- Quadraman


Whatever way you do it: it's a crucial feature.