MFCF/CSCF FAQ: Creating your own web pages


If you find out-dated, inaccurate or confusing items or you think that something should be added in this file, please send e-mail to consultant@math.uwaterloo.ca.

  1. How to Make a Home Page
  2. Forbidden access to a home page!
  3. Is there any on-line reference of HTML tutorial?
  4. .htaccess Files Allow You to Control Your Web Pages
  5. How to Put a PDF on the WWW
  6. About CGIs
  7. CGI Internal Server Errors
  8. Common Cause of CGI Internal Server Error 500
  9. About PHP (on the Math student web servers in particular)
  10. Hey, my PHP scripts do not work any more
  11. Removing carriage-return characters from a file (UNIX)
  12. Access Control by Hostname/Domain
  13. Access Control by Userids(Names) and Passwords
  14. Access Control by UWdir (Quest) userid/password
  15. Access Statistics
  16. Making gzipped Files Available on the Web
  17. Student Web Server Upgrade to Apache 1.3
  18. Putting Private Data on the WWW


1. How to Make a Home Page

How can I set up a home page on MFCF-supported or CSCF-supported Math Faculty Unix machines?

  1. Decide Where To Put Your Home Page
  2. Initialize Your Web Space
  3. Set File Permissions

Assume all references to the names of files, directories, commands, etc. are case sensitive.

  1. Decide Where To Put Your Home Page

    What follows are recommended Web servers on which to set up your home page. To determine your individual URL (uniform resource locator), simply append /~userid/ to the address.

    1. http://www.math.uwaterloo.ca
      Home pages created on any machine in the general math region will show up there.
    2. http://www.cs.uwaterloo.ca
      Home pages created on any machine in the general cs region will show up there.
    3. http://www.student.math.uwaterloo.ca
      Home pages created on any machine in the student.math region will show up there.
    4. http://www.student.cs.uwaterloo.ca
      Home pages created on any machine in the student.cs region will show up there.

    Other times a "Virtual Web Server" name is used; these usually begin with www. Examples at the time of writing are:

       http://www.scicom.uwaterloo.ca
       http://www.bioinformatics.uwaterloo.ca
    
    Not all Web server names use the www prefix:
       http://uwmachine.uwaterloo.ca/~userid/
    will sometimes work, while the following will not:
       http://www.uwmachine.uwaterloo.ca/~userid/
    Ask the people in charge of the research group machines about the machine on which you should create a home page. In rare cases, you and they may decide to arrange for a new Web server to be set up on one of their machines.


  2. Initialize Your Web Space

    Log in to the appropriate account. The starting point of your Web space is a directory named public_html under the home directory.

    Create this directory:

       % mkdir $HOME/public_html

    Next, create your HTML (hypertext markup language) document, call the file index.html, and place it in your public_html directory. index.html is the filename for which UW Web servers search when a URL is specified that corresponds to a directory.

    See the FAQ section On-line HTML Tutorial for the basics of HTML.
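
As a rough illustration of this step, the sketch below creates the public_html directory and a placeholder index.html. The function name init_web_space and the page contents are invented for this example; they are not part of the original instructions.

```shell
#!/bin/sh
# init_web_space HOME_DIR: create HOME_DIR/public_html and a placeholder
# index.html. The function name and page text are illustrative only.
init_web_space() {
    home_dir=$1
    mkdir -p "$home_dir/public_html" || return 1
    # Do not overwrite a page that already exists.
    if [ ! -e "$home_dir/public_html/index.html" ]; then
        cat > "$home_dir/public_html/index.html" <<'EOF'
<html><head><title>My Home Page</title></head>
<body>
<h1>Hello</h1>
<p>This page is under construction.</p>
</body></html>
EOF
    fi
}

# Typical use on your own account:
#   init_web_space "$HOME"
```

The function leaves an existing index.html untouched, so it should be safe to run more than once.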

    At some point, you may decide to add other HTML files and subdirectories to your Web space. You may also want to include CGI scripts (programs). Keep in mind that to access them directly, the subpath needs to be appended to the URL. For example, to go directly to the file aboutme.html in your public_html/stories/ directory, the corresponding URL would end in .uwaterloo.ca/~userid/stories/aboutme.html.


  3. Set File Permissions

    File permissions give you control over who gets what kind of access to which files in your account. Your goal here is to permit users to access the files in your Web space.

    People accessing your account using Web browser software, such as Navigator and Internet Explorer, belong to the category of other user. So the specific file permissions you should set are the following:

    Location               Permissions  Description
    Home directory         a+x          all execute
    public_html directory  a+rx         all read and execute
    HTML files             a+r          all read
    CGI programs           u+rx         user read and execute

    See the Unix Tutorial for information on using the chmod command to set file permissions.
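
The permissions in the table above could be applied with chmod roughly as follows. This is a sketch, not an official procedure: the function name set_web_perms is invented, it assumes CGI programs end in ".cgi", and it only touches files directly inside public_html, not subdirectories.

```shell
#!/bin/sh
# set_web_perms HOME_DIR: apply the permissions from the table above.
# Assumes CGI programs use the suffix .cgi (an assumption of this sketch).
set_web_perms() {
    home_dir=$1
    web_dir=$home_dir/public_html
    chmod a+x "$home_dir" || return 1      # home directory: all execute
    chmod a+rx "$web_dir" || return 1      # public_html: all read and execute
    for f in "$web_dir"/*.html; do
        [ -e "$f" ] || continue            # skip if the pattern matched nothing
        chmod a+r "$f"                     # HTML files: all read
    done
    for f in "$web_dir"/*.cgi; do
        [ -e "$f" ] || continue
        chmod u+rx "$f"                    # CGI programs: user read and execute
    done
}

# Typical use on your own account:
#   set_web_perms "$HOME"
```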

    Your home page is now live and available for browsing on the Web!


Last updated 2006-03-15 by d3wilkin
www/create_webpage.faq


2. Forbidden access to a home page!

I got the following message:
Your client does not have permission to get URL /~userid/ from this server.
How can I fix the problem?

Your file permissions are not set correctly. Your home directory and your public_html directory should both be searchable by everybody. The file ~/public_html/index.html should be readable by everybody. Do the following:


% cd 
% chmod o+x . public_html
% chmod o+r public_html/index.html

Last updated 1997-03-18 by IB
www/permissions.faq


3. Is there any on-line reference of HTML tutorial?

Take a look at: HTML Goodies HTML Primers for some simple, humorous lessons. Or go to: HTML Tutorial from W3Schools for a more in-depth overview.

The old version of the CS Club home page also has an HTML tutorial; look for the "Our Webspace" section. For all the nitty-gritty detail you could ever want, try the W3 Consortium site (in particular, the "Technical Area").

Last updated 2006-03-15 by d3wilkin
www/html_reference.faq


4. .htaccess Files Allow You to Control Your Web Pages

What is a .htaccess file, and what can I do with it?

If you are working on web pages created by other people, for example if you are an administrative assistant who must make updates to departmental web pages, you may occasionally encounter in those web pages files named .htaccess. These files can be very important and should not be simply deleted without careful consideration.

Or, if you are creating new pages, personal or not, you may want to make use of the facilities these control files can provide.

For a definitive guide, see:

(Or try the equivalent URLs on the web server where you create web pages; those will be more accurate in details for you.)

Any of the listed directives which has a context of ".htaccess" can be placed in a file named ".htaccess" within the directories containing your web pages, and will have effects on all web pages in that directory and in sub-directories below.

This works for personal, per-user, pages, and also for non-personal pages (for example the web-page structure on http://www.math.uwaterloo.ca/, where this FAQ resides).

If you want to determine what a pre-existing .htaccess file is doing, you should look up each of the directives in it in:

Warning: If you are not a computer scientist, you may wish to find one to help you decipher the file. This configuration can get very complicated.

Other sections of this FAQ attempt to give specific examples, but in general .htaccess files can allow you to control access to your files (prevent just anyone from seeing them), change the types of your files to affect how browsers handle them, and "Redirect" references to URLs in your web space to other locations. Additional directives, those which look somewhat like HTML containers, allow you to control the scope of the basic directives.

Don't forget to set the permissions on any .htaccess files so that they are world-readable.
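
As one small illustration of the "Redirect" use mentioned above, the following sketch writes a one-line .htaccess file and makes it world-readable. The function name and both page names (old.html, new.html) are placeholders; check the directive against the documentation on your own server before relying on it.

```shell
#!/bin/sh
# write_redirect_htaccess DIR: create DIR/.htaccess containing a single
# Redirect directive. The URLs are placeholders for illustration.
# Note: this overwrites any .htaccess already present in DIR.
write_redirect_htaccess() {
    dir=$1
    cat > "$dir/.htaccess" <<'EOF'
# Send requests for the old page to its new location.
Redirect /~userid/old.html http://www.math.uwaterloo.ca/~userid/new.html
EOF
    chmod a+r "$dir/.htaccess"    # .htaccess files must be world-readable
}

# Typical use:
#   write_redirect_htaccess "$HOME/public_html"
```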

Last updated 2002-10-17 by arpepper
www/features_htaccess.faq


5. How to Put a PDF on the WWW

I've been given a PDF to make available on the WWW; how should I do it?

The quick answer is not to put large PDFs on the WWW or, if you must, to link to them and warn the reader of the long download time. There are apparently no general PDF editors; one is expected to modify the original and rebuild the PDF, or use Acrobat to edit the layout only. Conversion to HTML is preferred, i.e. text becomes text. Tools like DreamWeaver or HotMetal are good for editing the results of mechanical conversion to HTML, e.g. by Word. If the PDF is mostly pictures, or if positioning is critical, then conversion to JPEG may be an alternative.

For another possible alternative, consider the technique discussed in Making gzipped Files Available on the Web, but be sure to consider carefully the caveats noted there.

Created by gwridley
Last updated 2003-11-28 by ARPepper
www/pdf.faq


6. About CGIs

I have a few CGI scripts for my homepage, but what do I need to do to run them here?

The answers here apply primarily to the major web servers such as www.student.math, but in most cases generalize to most Math Faculty web servers supported by CSCF/MFCF.

First take a look at a thorough guide to CGI security issues, such as Michael Van Biesbrouck's CGI security tutorial.

Do not, I repeat, do not simply rename your file to have the suffix .pl. On one or two Math web servers that will appear to work, but it actually has some undesirable side-effects.

Instead, realize that CGI's are not enabled by default, but on most Math Faculty web servers you can arrange to have them run by putting appropriate directives in a .htaccess file in the same directory or above. Note that when run in this fashion from a personal web page, the SUEXEC feature of Apache is used so that the programs will be run as your userid, and can write to files writeable by only you.

Note that since personal CGIs are run as your own userid, you SHOULD NOT put world-write permission on files in order for your CGIs to write to them.

As an example of how to enable CGIs:

    AddHandler cgi-script .cgi

in an appropriate .htaccess file will cause all files with the suffix “.cgi” to be handled as CGIs by the Web server. This will apply to all files in the directory containing the .htaccess file, and also any sub-directories below it. Like most web-page files, .htaccess files must be world-readable.

    <Files *.pl>
        SetHandler cgi-script
    </Files>

The directives above will cause the web server to treat any file whose name ends in .pl as a CGI script. Note that on some servers this will happen anyway for the special case of .pl files; this is outdated behaviour which will probably disappear at some point.

Note that personal CGI pages must have the UNIX execute permission set; execute by the owner is enough, so you do not need to leave your CGI pages world-executable which would allow anyone else on your web server to run them locally with possibly bizarre results.
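
To make the above concrete, here is a sketch of a very small CGI written as a shell script, together with the owner-only execute permission just described. The name hello.cgi and the helper function are invented for illustration, and the script assumes a .htaccess file with "AddHandler cgi-script .cgi" is in effect for the directory.

```shell
#!/bin/sh
# make_hello_cgi DIR: write a minimal CGI script, DIR/hello.cgi, and make
# it readable and executable by its owner.
make_hello_cgi() {
    dir=$1
    cat > "$dir/hello.cgi" <<'EOF'
#!/bin/sh
# The header line and the empty line after it must be output first.
echo "Content-type: text/html"
echo ""
echo "<html><head><title>Hello</title></head>"
echo "<body><p>Hello from a CGI script.</p></body></html>"
EOF
    chmod u+rx "$dir/hello.cgi"
}

# Typical use:
#   make_hello_cgi "$HOME/public_html"
```

Running the generated script by hand from the command line is a quick way to check its output before trying it through the web server.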

Last updated 2006-03-15 by d3wilkin
www/scripts_cgi.faq


7. CGI Internal Server Errors

My CGI causes an "internal server" error; what do I do?

When a CGI produces invalid output, the diagnostic message is often less than informative.

First check Student Web Server Upgrade to Apache 1.3 to see whether your problem is caused by Apache 1.3's slightly silly directory permission requirements.

If not, you can get hints about what's really going wrong by looking in

        /software/wwwapache-1.3_server/logs/error_log
        /software/wwwapache-1.3_server/logs/access_log
        /software/wwwapache-1.3_server/logs/cgi.log

at the time you're trying the failing CGI.

Make sure that you are on the host that has your WWW server. E.g.:

Those servers handle lots of requests, so sometimes it may be difficult to identify diagnostics relating to your own pages from the many recent entries in the various logs; be prepared to scan backwards from the bottom a long way, or use tools to search for your own errors.
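
One possible way to search a busy log is sketched below. It assumes that log lines about your pages mention your userid somewhere (for example in a file path); the function name recent_errors is invented for this example.

```shell
#!/bin/sh
# recent_errors LOGFILE USERID [COUNT]: show the last COUNT (default 20)
# lines of LOGFILE that mention USERID.
recent_errors() {
    log=$1
    userid=$2
    count=${3:-20}
    # -F treats USERID as a plain string rather than a pattern.
    grep -F -- "$userid" "$log" | tail -n "$count"
}

# Example, using a log path quoted in this FAQ and a placeholder userid:
#   recent_errors /software/wwwapache-1.3_server/logs/error_log userid
```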

To all intents and purposes, the first two lines output by a CGI must be:

         Content-type: text/html

The second of those two lines is hard to see, because it is actually empty. But, you must output the Content-type header, followed by an empty line. If you manage to do that, the server should not give the error, and a browser should interpret it somehow. (Of course, if the HTML is bad, the browser will do strange things).

In fact, the blank line is more important than the Content-type: header; everything up to the blank line is assumed by the web server to be headers, and must pass validity checks. Now, when a CGI program has errors, its error output (stderr) gets mingled with any output generated (usually the error output is seen first, but this is not entirely deterministic). If those error messages do get seen before the blank line, they get interpreted by the web server as headers, and will be diagnosed as invalid, resulting in the server error returned to the browser.

A simple way to debug the actual output of your CGI program is to insert commands at the top of it that cause it to output the following two lines:

Content-type: text/plain

The second of the two lines is blank. What this is doing is outputting a valid set of headers (the single Content-type: line) followed by a blank line. The text/plain causes the web browser to simply display whatever follows as text, without interpreting it as HTML and obeying the markup therein. This should allow you to see any problems with the headers generated by the actual CGI code.

If your CGI might generate error output after running mostly successfully (for example, this might happen to a perl script for a number of reasons, such as inappropriate use of an undefined value), another way to debug it is to call the actual CGI program from a wrapper which sends all error output (stderr, unit 2) to /dev/null (or alternatively to a file for your examination--be careful because anyone on the Web can then cause such output to be written to that file).
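
A wrapper of the kind just described might look something like the sketch below. The helper name make_cgi_wrapper and the file names in the usage comment are invented for illustration; the generated wrapper simply runs the real CGI with its error output sent to /dev/null.

```shell
#!/bin/sh
# make_cgi_wrapper REAL_CGI WRAPPER: write WRAPPER, a small script that
# runs REAL_CGI with its error output (stderr) discarded.
make_cgi_wrapper() {
    real_cgi=$1
    wrapper=$2
    cat > "$wrapper" <<EOF
#!/bin/sh
# Run the real CGI; send stderr (unit 2) to /dev/null.
exec "$real_cgi" 2>/dev/null
EOF
    chmod u+rx "$wrapper"
}

# Typical use (both names are placeholders):
#   make_cgi_wrapper /u/userid/public_html/real.cgi /u/userid/public_html/wrap.cgi
```

You would then point your browser at the wrapper instead of the real CGI while debugging.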

Last updated 2006-03-15 by d3wilkin
www/scripts_error.faq


8. Common Cause of CGI Internal Server Error 500

My CGI causes an "internal server" error 500; what do I do?

If a CGI script file is not valid, it can cause that diagnostic.

A common reason for a script file being invalid is that it was produced on another operating system, and then uploaded to the web server. If that operating system ends each line with the sequence of carriage-return (hex 0D) followed by linefeed (hex 0A), then a UNIX web server will typically not like the first line of the file, which is supposed to cause the appropriate processing program to be invoked.

That is, the line

        #!/software/php/bin/php

will be seen as

        #!/software/php/bin/php^M

And there is no such program as php^M (where "^M" represents the usually invisible carriage-return character (hex 0D)).

Removing the carriage-return can be done in any number of ways. See Removing carriage-return characters from a file (UNIX) .
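
To check whether a script has this problem before removing anything, a test along the following lines may help. It is only a sketch; the function name has_cr_shebang is invented for this example.

```shell
#!/bin/sh
# has_cr_shebang FILE: succeed (exit status 0) if the first line of FILE
# contains a carriage-return character (hex 0D).
has_cr_shebang() {
    cr=$(printf '\r')
    head -n 1 "$1" | grep -q "$cr"
}

# Typical use:
#   if has_cr_shebang script.php; then echo "carriage-returns present"; fi
```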

Last updated 2005-03-16 by arpepper
www/scripts_crname.faq


9. About PHP (on the Math student web servers in particular)

Can I run PHP on MFCF-supported web servers?

The answers here apply primarily to www.student.math and www.student.cs, but in most cases generalize to most Math Faculty web servers supported by MFCF.

Ask a system administrator whether the facility is available or can be made available on the MFCF-supported web server you use.

You can run php pages on the www.student.math.uwaterloo.ca and www.student.cs.uwaterloo.ca web servers, but there are caveats.

Examples detailing this can be found in each of the following two:

When signed on to the student.cs region, you can access the corresponding source directories:

Note that you cannot read the pages under the latter directory, but the web server effectively can. The pages under each are the same, except the first uses a ".phtml" extension while the second uses ".php". Many favour using ".phtml" instead of ".php" when php pages are done in the way I describe, but it's up to you.

Note that you must start each page with

     #!/software/php/bin/php

or

     #!/xhbin/php

As also required, the .htaccess files in each of the above directories contain one of the following directives (or equivalent) to cause the files to be interpreted as php (assuming the "#!/software/php/bin/php" has been inserted). You must add such directives in a .htaccess file which is in effect for the directory containing php (creating such a .htaccess file, if necessary).

     AddHandler cgi-script .phtml
     AddHandler cgi-script .php

As is the case with all CGI programs, they must have the appropriate UNIX execute permission bit set. In the case of personal PHP pages, execute for yourself is sufficient, since CGI's in personal web space are run as the user who owns the space. That is, for personal PHP pages, you do not need to set the permissions which would allow arbitrary other users on your machine to run them.

Here is what the permissions should look like:

  -rwxr--r--   1 arpepper arpepper      70 Aug 30  2001 readfile.phtml
  -rwx------   1 arpepper arpepper      70 Aug 30  2001 readfile.php
except your userid should be there instead of mine. That is, world read is optional, but world execute seems inadvisable.
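
Putting the pieces of this section together, a page could be created and given suitable permissions roughly as follows. This is a sketch under assumptions: the file name hello.phtml, its contents, and the function name are invented, and a .htaccess file containing the appropriate AddHandler directive is assumed to be in effect already.

```shell
#!/bin/sh
# make_php_page DIR: write a small PHP page, DIR/hello.phtml, that starts
# with the interpreter line this FAQ requires, then set its permissions.
make_php_page() {
    dir=$1
    cat > "$dir/hello.phtml" <<'EOF'
#!/software/php/bin/php
<html><head><title>Hello</title></head>
<body>
<p>Today is <?php echo date("Y-m-d"); ?>.</p>
</body></html>
EOF
    # Owner may read and execute; nobody else may execute.
    chmod u+rx,go-x "$dir/hello.phtml"
}

# Typical use:
#   make_php_page "$HOME/public_html"
```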

We would have concerns about making php generally available in a manner which does not require the "#!/software/php/bin/php" line, since implementation details of Apache mean it provides a means for users to run programs and create files as some semi-anonymous userid, usually "nobody" or "www". For this reason, even if we got built-in php available, we might want to restrict its availability (i.e. we might allow course accounts, but not personal accounts).

Note that my "guestbook.phtml" script above writes to a file owned by me under my filespace, as me. If such facilities are required, this is preferable to having to have files owned by the web server userid (or, horrors, world-writeable). Note also how the current scheme allows the .php files to be readable only by the user who owns them, and not even by the web server userid. These are more advantages of not using built-in php, but using php as a CGI processor in the fashion we are doing.

Created by d3wilkin
Last updated 2007-03-19 by ARPepper
www/scripts_php.faq


10. Hey, my PHP scripts do not work any more

Why did my PHP scripts, which used to work, recently stop working?

The answer is that the PHP developers changed the way you can access information passed in to your php program. Whereas prior to PHP 4.2.0 many variables were readily accessible directly (including names derived from POST, GET, or COOKIE data), you must now instead access those values from special arrays.

This is most easily explained using the example from http://www.student.cs.uwaterloo.ca/~arpepper/phtml/guestbook.phtml.

Here is the old code.

<html><head><title>Sample Guestbook</title></head>
<body>
<h1>Welcome to a Guest Book</h1>
<h2>Enter your note below</h2>
<form action="<?echo $PHP_SELF?>" method="POST">
<textarea cols=40 rows=5 name=note wrap=virtual></textarea>
<input type=submit value="Send">
</form>
<?if(isset($note)) {
	$fp=fopen("/u/arpepper/guests.txt","w");
	fputs($fp,nl2br($note).'<br>');
	fclose($fp);
}
?><h2>The entries so far:</h2>
<? @ReadFile("/u/arpepper/guests.txt") ?>
</body></html>

And here is a version modified so that it will work with the recent PHP changes.

<html><head><title>Sample Guestbook</title></head>
<body>
<h1>Welcome to a Guest Book</h1>
<h2>Enter your note below</h2>
<!--
  Prior to php 4.2.x $PHP_SELF, and any GET/POST variables were predefined

  It is now necessary to instead use the appropriate entries in
  _GET, _POST, or _SERVER.
-->
<?if(isset($_SERVER['PHP_SELF'])) {
	$PHP_SELF = $_SERVER['PHP_SELF'];
}
?>
<form action="<?echo $PHP_SELF?>" method="POST">
<textarea cols=40 rows=5 name=note wrap=virtual></textarea>
<input type=submit value="Send">
</form>
<?if(isset($_GET['note'])) {
	$note = $_GET['note'];
}
?>
<?if(isset($_POST['note'])) {
	$note = $_POST['note'];
}
?>
<?if(isset($note)) {
	$fp=fopen("/u/arpepper/guests.txt","w");
	fputs($fp,nl2br($note).'<br>');
	fclose($fp);
}
?><h2>The entries so far:</h2>
<? @ReadFile("/u/arpepper/guests.txt") ?>
</body></html>

As you can see, it has changed so that the variable name 'note' which used to be set automatically is instead conditionally extracted from the GET or POST data. It is less obvious that the variable 'PHP_SELF' was extracted from the program environment variables, but it was, and so its value must be found in the _SERVER array, as shown.

For details about PHP syntax and semantics, see appropriate sections at http://www.php.net/. Be warned that some of those pages are actually message boards, and people occasionally say confusing, and even incorrect, things.

Last updated 2006-05-23 by arpepper
www/scripts_php_changes.faq


11. Removing carriage-return characters from a file (UNIX)

I got a text file from another operating system, but when I uploaded it to UNIX I encountered various problems because each line has a carriage-return before the linefeed.

This can happen if, for example, you download a package of scripts (PHP perhaps) from another source and attempt to use them on a UNIX system. If you download them indirectly (through your non-UNIX PC perhaps) that might cause the problem, but it can sometimes happen even without that. The carriage return characters can cause various problems.

Usually simply deleting the carriage returns makes the file work properly. There are any number of ways to do this, but, since most machines students have access to are Solaris machines, we will point you to the Solaris "dos2unix".

The following UNIX command will provide some documentation of the command.

      man dos2unix

Suppose you have a file "script.php" which needs converting; then you might do something like the following.

      dos2unix script.php x.tmp
      cp x.tmp script.php

There is also an analogous command "unix2dos" which may be useful if you want to be able to read files you originally created on UNIX with certain text editors on certain other Operating Systems.

Last updated 2005-03-16 by arpepper
edit/transform_delcr.faq


12. Access Control by Hostname/Domain

How do I restrict webpage access by domain/hostname?

For a definitive guide, see http://www.student.math.uwaterloo.ca/Server/mod/mod_access.html

The following .htaccess file, placed in the directory to be protected, seems somewhat logical, and does (I tested) deny access to normal attempts to access it from outside the math.uwaterloo.ca domain:

	AuthType Basic

	order deny,allow
	deny from all
	allow from math.uwaterloo.ca

Access control of pages is independent of CGI's. The content of a .htaccess file will never prevent cgiwrap CGI's from being executed. It can, however, prevent the source from being accessed as a text file.

Environment variable settings can allow a CGI to infer the name of a machine requesting it, and effect hostname restrictions that way.
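
As a sketch of that idea, the CGI below looks at the standard CGI variable REMOTE_HOST and refuses requests from outside math.uwaterloo.ca. It assumes the web server is configured to look up and pass hostnames; if it is not, REMOTE_HOST may be empty and every request will be refused. The function name host_allowed is invented for this example.

```shell
#!/bin/sh
# host_allowed: succeed if the requesting machine's name, as passed in the
# CGI variable REMOTE_HOST, is in the math.uwaterloo.ca domain.
host_allowed() {
    case ${REMOTE_HOST:-} in
        math.uwaterloo.ca|*.math.uwaterloo.ca) return 0 ;;
        *) return 1 ;;
    esac
}

if host_allowed; then
    printf 'Content-type: text/plain\n\nWelcome.\n'
else
    printf 'Status: 403 Forbidden\nContent-type: text/plain\n\nAccess denied.\n'
fi
```

The pattern requires a dot before "math", so a name such as evilmath.uwaterloo.ca is not accepted.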

Don't forget to set the permissions on the .htaccess file so that it is world-readable. (This is a requirement of the Apache Web server software run on most Math faculty UNIX Web servers).

Last updated 2000-05-01 by arpepper
www/features_access_byhost.faq


13. Access Control by Userids(Names) and Passwords

How do I set up userid/password access to a webpage?

For a definitive guide, see http://www.student.math.uwaterloo.ca/Server/mod/mod_auth.html

The following .htaccess file, placed in the directory to be protected, seems somewhat logical, and does (I tested) deny access to at least simple incorrect username/password cases:

	AuthType Basic
	AuthName "My Test"
	AuthUserFile /u/arpepper/support_html/private/.htpasswd
	AuthGroupFile /u/arpepper/support_html/private/.htgroup

	require group mygroup

By implication, the user must create the .htpasswd and .htgroup files, putting them where they want, usually in their file space, but not in their web space. The exact pathnames above are only examples; the names can be almost entirely arbitrary. Note that these files must, perhaps unfortunately, have their permissions set to allow anyone on the web server machine to read them. The .htaccess file must have its permissions set the same way.

Access control of pages is independent of CGI's, and the method described here is fairly (though not perfectly) safe, although passwords travel over the network merely encoded (base64, trivially decodable), not encrypted. The content of a .htaccess file will never prevent cgiwrap CGI's from being executed. It can, however, prevent the source from being accessed as a text file via the Web. To get access control to CGI's, lines must be added to the .htaccess file to define the CGI's. It is outside the scope of this section to describe that in more detail.

"users" (arbitrary names, unrelated to the UNIX userids on the machine, even if some of them are the same) and encrypted passwords would be listed in the .htpasswd file, and the subset of users to be allowed access would be listed as in group "mygroup" (according to the example) in the .htgroup file. Essentially then, "user/password" pairs become a double password to the page.

Here is a sample .htpasswd file:

        guest:MPWhrgNG2bh1I
        henry:p6GJaBjI8dEwo
        fred:pv4HfS/8bAw/E
(There should be no whitespace to the left of the lines, however).

Here is a sample corresponding .htgroup file:

        mygroup: fred henry
        guestgroup: guest
(Once again, there should be no whitespace to the left of the lines).

That combination would allow pseudo-users "fred" and "henry" access to the pages, but not "guest". (A different page might want to use the same files, but allow access to a different group).

The program "htpasswd" (currently /software/www_server/bin/htpasswd on www.student.math or www.math) can be used to create new users and passwords, or update passwords. This program is typically only available on the actual Web server machine itself, since it is bundled with all the Web server software, so you must login to that actual Web server machine or you probably will not find the command.

Don't forget to set the permissions on the files so that they are world-readable.
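
The setup described in this section could be scripted roughly as follows. This is a sketch: the function name setup_basic_auth and the paths in the comments are placeholders, and the password file itself is still created by hand with htpasswd because that program prompts for each password.

```shell
#!/bin/sh
# setup_basic_auth WEB_DIR PRIVATE_DIR: protect WEB_DIR with the directives
# shown above, keeping the password and group files in PRIVATE_DIR.
setup_basic_auth() {
    web_dir=$1
    private_dir=$2
    mkdir -p "$private_dir" || return 1

    # Group file: members of "mygroup" are allowed in.
    cat > "$private_dir/.htgroup" <<'EOF'
mygroup: fred henry
EOF

    # Access file. Note: this overwrites any existing .htaccess in WEB_DIR.
    cat > "$web_dir/.htaccess" <<EOF
AuthType Basic
AuthName "My Test"
AuthUserFile $private_dir/.htpasswd
AuthGroupFile $private_dir/.htgroup

require group mygroup
EOF

    # The web server must be able to reach and read these files.
    chmod a+x "$private_dir"
    chmod a+r "$web_dir/.htaccess" "$private_dir/.htgroup"
}

# The password file is created separately (-c creates the file, so use it
# for the first user only); the paths here are placeholders:
#   htpasswd -c /u/userid/private/.htpasswd fred
#   htpasswd /u/userid/private/.htpasswd henry
#   chmod a+r /u/userid/private/.htpasswd
```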

Last updated 2000-11-16 by arpepper
www/features_access_byuser.faq


14. Access Control by UWdir (Quest) userid/password

Can I arrange that users must use their UWdir (Quest) password to access my pages?

Yes. Add the following in the proper section of your .htaccess file:

PerlAuthenHandler Apache::AuthenURL
require valid-user
SSLRequireSSL
The first line indicates that UWdir authentication will be used, if authentication is used. The second line indicates that authentication will be used. The final line forces people to access your file or CGI through https rather than http, which is necessary in order to protect the userids and passwords as they go over the network. Please note that this means the URLs for your protected content will start with https:// rather than the usual http://.

The above directives could go directly in a .htaccess file, in which case the authentication would apply to everything in the same directory and in its subdirectories, or they could go in a <Files> section, in which case they would apply only to the designated files.
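
For the <Files> variant, a helper such as the following could generate the section for one file. It is only a sketch; the function name and the file name grades.html are invented for illustration, and the three directives are copied unchanged from the example above.

```shell
#!/bin/sh
# uwdir_files_section FILENAME: print a <Files> section that applies the
# UWdir directives quoted above to one file only.
uwdir_files_section() {
    cat <<EOF
<Files $1>
PerlAuthenHandler Apache::AuthenURL
require valid-user
SSLRequireSSL
</Files>
EOF
}

# Typical use (grades.html is a placeholder name):
#   uwdir_files_section grades.html >> .htaccess
#   chmod a+r .htaccess
```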

The above could further be modified to use a group file to allow only a subset of users to access the pages.

Don't forget to set the permissions on the .htaccess file so that it is world-readable. (This is a requirement of the Apache Web server software run on most Math faculty UNIX Web servers).

Last updated 2004-01-20 by ijmorlan
www/features_access_imap.faq


15. Access Statistics

How do I determine how many times my pages are accessed?

Look at either:

        http://YourHostnameHere/Stats.html
        

or (equivalently) at the file:

        /software/www_server/logs/wwwstat/wwwstats.html
        

or, for raw details for the current month, at the file:

        /software/www_server/logs/access_log
        

Make sure that you are on a host that has a WWW server running. For the math student environment the host is www.student.math.

If you look at the file, you'll likely have to look for things of the form:

        /%7EYourUserid/PathUnderPublicHtml
        

E.g. the file

        ~/public_html/my/stuff.html
		

would appear in the log as

        /%7EYourUserid/my/stuff.html
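
Counting the matching lines in the raw log can be done with grep, as in the sketch below. The function name count_hits is invented, and the sketch assumes requests are logged in the %7E form shown above; requests logged with a literal "~" would not be counted.

```shell
#!/bin/sh
# count_hits LOGFILE USERID PATH: print the number of lines in LOGFILE
# that mention /%7EUSERID/PATH.
count_hits() {
    log=$1
    userid=$2
    path=$3
    # -F matches the string literally; -c prints the count of matching lines.
    grep -c -F -- "/%7E$userid/$path" "$log"
}

# Typical use, with the log file named in this FAQ:
#   count_hits /software/www_server/logs/access_log YourUserid my/stuff.html
```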
		

It's also possible to use a "page counter", but this consumes considerable resources, and thus should be avoided on anything that isn't a private workstation.

Last updated 1998-04-20 by javoskamp
www/features_pagecount.faq


16. Making gzipped Files Available on the Web

How can we make gzipped documents (e.g. PostScript files) available on the web so browsers will handle them properly?

For the foreseeable future, you need to override the default headers produced by Unix Apache Web Servers at UW by putting appropriate directives in your .htaccess file.

The following .htaccess file, placed in the directory to contain the documents, will cause the Apache Server to send appropriate headers to indicate the appropriate type and encoding for PostScript (.ps) and HTML (.html) files:

     <Files *.html.gz>
     AddEncoding x-gzip   .gz
     AddType text/html .gz
     </Files>
     <Files *.ps.gz>
     AddEncoding x-gzip   .gz
     AddType application/postscript .gz
     </Files>

The trick (kludge?) is to use a "Files" pattern to restrict the domain of AddEncoding and AddType directives. (A problem being that doubly-dotted extensions are not recognized the way one would like in AddEncoding statements).

If you needed to post gzipped versions of other file types, you could probably determine the necessary value for the "AddType" directive by using the following command:

      lynx -dump -head URL
For example:
      % lynx -dump -head http://db.uwaterloo.ca/~arpepper/presentation.ps
      HTTP/1.1 200 OK
      Date: Fri, 14 Nov 2003 22:48:08 GMT
      Server: Apache/1.3.28 Ben-SSL/1.48 (Unix) mod_perl/1.21
      Last-Modified: Tue, 15 Jan 2002 01:10:14 GMT
      ETag: "b01728-ba2c-3c438176"
      Accept-Ranges: bytes
      Content-Length: 47660
      Connection: close
      Content-Type: application/postscript

 
      %
is how the necessary "Content-type" for .ps files could have been determined.


The bad news is that, unless unusually configured, the Microsoft Internet Explorer web-browser doesn't seem to deal with the setup properly. However, if you are willing to risk possible confusion by renaming your compressed files so they do not have the ".gz" extension, then the following can be used in your .htaccess file:

    <Files *.ps>
    AddEncoding x-gzip   .ps
    AddType application/postscript .ps
    </Files>

Assuming the user's browsing machine has some application indicated as handling ".ps" files, then the above should cause the Microsoft Internet Explorer web-browser to open the downloaded files with that application (as opposed to the application the machine thinks is associated with ".gz" files, which will be inappropriate).

In fact, the above "AddType" directives will be redundant for most file types, and so this scheme can also relieve you of the responsibility for determining the appropriate type-string to use.

It would probably be a good idea to use a separate directory to house the compressed postscript files. The name of that directory should remind users of various types that its contents are compressed, and must be compressed, but without .gz extensions. E.g. "/gz/" perhaps.
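
Preparing such a directory might look like the following sketch, which compresses a file into a "gz" subdirectory while keeping its original name (so no ".gz" extension). The function name publish_gzipped and the file names are placeholders; the .htaccess directives shown earlier still have to be placed in that directory.

```shell
#!/bin/sh
# publish_gzipped SRC WEB_DIR: store a gzip-compressed copy of SRC as
# WEB_DIR/gz/<name of SRC>, without adding a .gz extension.
publish_gzipped() {
    src=$1
    web_dir=$2
    dest=$web_dir/gz/$(basename "$src")
    mkdir -p "$web_dir/gz" || return 1
    gzip -c "$src" > "$dest" || return 1
    chmod a+r "$dest"
}

# Typical use:
#   publish_gzipped presentation.ps "$HOME/public_html"
```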

Last updated 2003-11-14 by ARPepper
www/features_ps_gz.faq


17. Student Web Server Upgrade to Apache 1.3

Why did my web pages break when Web Server Software was upgraded from Apache 1.2.6 to Apache 1.3?

This problem actually affects all Apache 1.3 servers on campus, not just the student.math web server, and applies equally when creating new CGI's.

There seems to have been an additional security restriction put on cgi scripts in Apache 1.3--they may not be in a directory which is group-writeable. If they are, an error occurs, and the script does not run properly. (Typically, you will get "Internal Server Error" in your browser). A message indicating this problem will also appear in the following file on the machine which runs the Web server:

   /software/www_server/logs/cgi.log

Right now, the only work-around seems to be to remove the group write permission from the directory. I.e. it must say something like:

mef07.math% ls -ld scripts
drwxr-xr-x   4 cs130    cs130       4096 Sep 14 09:23 scripts
not:
mef07.math% ls -ld scripts
drwxrwxr-x   4 cs130    cs130       4096 Sep 14 09:23 scripts

In addition to the above restriction, CGI's themselves must be owned by the logical owner of the personal pages containing them, and must not be group-writeable. Although it would appear to follow logically, I'll also mention explicitly that both the CGI itself and the directory containing it must not be world-writeable, either.
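
The permission changes described above can be made with chmod, for example as in this sketch. The function name fix_cgi_dir_perms is invented, and the sketch assumes the CGI files in the directory end in ".cgi"; it does not change file ownership.

```shell
#!/bin/sh
# fix_cgi_dir_perms DIR: remove group-write and world-write permission
# from DIR and from the *.cgi files directly inside it.
fix_cgi_dir_perms() {
    dir=$1
    chmod go-w "$dir" || return 1
    for f in "$dir"/*.cgi; do
        [ -e "$f" ] || continue    # skip if the pattern matched nothing
        chmod go-w "$f"
    done
}

# Typical use:
#   fix_cgi_dir_perms scripts
```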

Last updated 2001-11-01 by arpepper
www/scripts_apache13.faq


18. Putting Private Data on the WWW

New Private File Web Capability

Effective March 1, 2004, a new capability is available on the main Math Faculty (including CS) web servers www.math, www.cs, www.student.math, and www.student.cs. It is now feasible to put private files on the Web in such a way that only authorized users can see them.

Normal Apache web server configuration techniques involving .htaccess files suffice to keep files hidden from Web users, other than authorized ones, but in order for the Web server itself to be able to access the files in order to serve them to users they must be world-readable. This means that any user with login access to the Web server can read the files as well. On the student system this means almost any student or faculty member; on the general system this means almost any graduate student or faculty member.

As of March 1, 2004 the main Web servers are run as user www, in Unix group www. This means that files to be served on the Web need only be group www and group readable in order to be accessed by the Web server. Any file or directory which is meant to have distribution restricted should have world-read removed from it.

The www-privatize Command

The www-privatize command provides a way for users to make files visible to the Web server software while simultaneously hiding them from other users logged on to the Web server or another host in the same region. Note that this does not, by itself, impose any HTTP access controls on the files, so any web user from anywhere in the world can, by default, still see the files.

Type www-privatize by itself to get a brief summary of the command format. There are two options:

After these options (possibly none), you can put as many file and directory names as you want. Each specified file and directory will be made visible to the web server software and invisible to other users. Some important notes:

Questions & Comments

Please contact Isaac Morland (ijmorlan@uwaterloo.ca) with any questions or comments. I would be happy to help anybody use the new private files capability.

Last updated 2005-06-21 by w22li
www/www_private_data.faq

