Simple Search Engine Optimization
This is a simple trick that may get your pages a higher ranking in the search engines.
Note that it is not guaranteed to give you the top position in the search
engines, nor is it guaranteed to keep working -- personally I envisage that at some point
in the near future search engines will include a facility to detect such pages and not rank
them. It isn't guaranteed to get you any clicks either -- users are getting selective with
their searches and can discard sites that just advertise certain keywords rather than providing
content related to those keywords. In other words, this is not guaranteed to get you clicks
or even a higher ranking, but it is a simple trick you can try in order to bump up your
ranking in the search engines.
The whole idea with the search engines is that they analyze the contents of your pages
and extract certain keywords that are considered relevant, then count the number of occurrences
of these keywords and, based on this, probably store a ranking for your page against each keyword.
Based on these rankings, when a user searches for some of the keywords that the search engine has
found in your page, your page will appear in the list at a certain position. If your ranking
is quite low, you are less likely to get visitors, as your page listing might
not appear on the first page of search results -- or it might appear on the first page but at the bottom.
(Typically, you will find that users stop at the first three to five results.)
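As a rough illustration of the counting step described above, here is a minimal Python sketch. It is only a toy model: search engines do not publish how they rank pages, and the function name `count_phrase` and the sample text are made up for this example.

```python
import re

def count_phrase(page_text: str, phrase: str) -> int:
    """Count case-insensitive, whole-word occurrences of a phrase in page text."""
    # Collapse runs of whitespace so line breaks inside the phrase don't matter.
    normalized = re.sub(r"\s+", " ", page_text.lower())
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, normalized))

# Hypothetical page text, used only for illustration.
sample_page = """The quick lazy fox is not a real animal.
Still, the quick lazy
fox appears twice in this text."""

occurrences = count_phrase(sample_page, "the quick lazy fox")
print(occurrences)
```

A real ranking would presumably weigh many more signals than a raw count; the sketch only shows why repeating a phrase changes the number a naive counter sees.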
Therefore, one simple idea to bump up your page ranking is to include more of the keywords that
your page is "talking" about in the page itself. This way, the search engine will detect more
occurrences of these keywords and rank the page higher in its results listings. The problem is, if you
overload your page with, say, 1000 repetitions of a word, there will hardly be any space left for the
actual content, or the page will look very unpleasant and/or unprofessional. (Imagine a page that ends with
about a dozen paragraphs of just the word "computer"!)
To work around this, you can use some clever CSS -- simply add a <span>
with absolute positioning (so it doesn't affect the layout of all the other components on the page)
and with visibility set to hidden (so it doesn't actually show on the page and irritate the user) and
in that span place all the keywords you want to be ranked higher in the search engines.
Below is an example of how to bump up the ranking of the phrase "the quick lazy fox":
<span style="position: absolute; left: 0px; top: 0px; visibility: hidden">
the quick lazy fox
the quick lazy fox
the quick lazy fox
the quick lazy fox
<!-- ... keep repeating this as much as you like -->
</span>
In fact, at the time this was written the page included such a span for this
phrase. If you view the source, you will notice it at the bottom of the page. Once the
search engines have indexed this page, I will update it with news regarding the position
it got in the search results.
You can use the buttons below to show or hide the span. (Please note
that you will have to scroll to the top of the page to see it.)
I've also included buttons to search Yahoo and Google based on the phrase the quick lazy fox;
clicking on these buttons will open up Yahoo or Google with this search phrase and hopefully
you should see this page at the top of the search results -- fingers crossed! :-)
One last comment about this trick: obviously, the more times you repeat the phrase in
the page, the bigger the file becomes and the longer the page takes to load. So it
is wise to use this trick in moderation.
Optimize the Download Speed of a Web Page: Whitespaces
When building a webpage, I personally prefer to work at the HTML level -- which means I don't
rely on tools like FrontPage and such. While this gives one more control over the webpage's
HTML, it also means that in most cases you have to structure your HTML so that it is easy
to read, understand and (occasionally) debug. Typically, this is achieved via indentation, so you
can see at a glance the hierarchy of the HTML elements.
That's all good -- for the writer/designer. However, this indentation has a downside (like everything
else): it increases the page loading time. Obviously, more spaces in the HTML
mean a bigger file -- and a bigger file means a longer
download time -- says the Department of Bleeding Obvious :-)
Agreed, most web servers and web browsers will negotiate a gzip-encoded transfer,
so that the page is compressed before sending and then decompressed by the browser before rendering.
While this does decrease the download time (at the cost of a little extra load on the server and browser),
it doesn't change the fact that your pages are still slightly bigger than they would be without the
extra whitespace -- it only makes it less obvious.
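To see how much difference the whitespace makes before and after compression, one can simply measure it. The following Python sketch does so for a synthetic table; the helper `build_table`, the row count and the one-space-per-level indentation are assumptions made for this example, and the numbers will differ for a real page.

```python
import gzip

def build_table(rows: int, indent: bool) -> bytes:
    """Build a small HTML table, with or without layout whitespace."""
    if indent:
        row = " <tr>\n  <td>row {n}</td>\n </tr>\n"
    else:
        row = "<tr><td>row {n}</td></tr>"
    body = "".join(row.format(n=n) for n in range(rows))
    newline = "\n" if indent else ""
    return f"<table>{newline}{body}</table>".encode("ascii")

# Synthetic pages, used only for illustration.
indented = build_table(200, indent=True)
compact = build_table(200, indent=False)

for label, data in (("indented", indented), ("compact", compact)):
    compressed = gzip.compress(data)
    print(f"{label}: {len(data)} bytes raw, {len(compressed)} bytes gzip-encoded")
```

Running this against your own pages (read from disk instead of generated) shows how much of the raw saving survives once gzip is in the picture.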
Here are a few small hints on where to look when optimizing your page sizes:
Group More Elements on a Single Line
Consider the following code:
<table>
 <tr>
  <td>contents goes here</td>
 </tr>
</table>
The code is very nicely structured, such that one can see that the td element
is inside the tr element, which is inside the table element. However,
the browser doesn't really care whether the HTML is nicely laid out on separate lines and
properly indented -- so straight away one can spot that we can get rid of (at least) 6 whitespace characters
by grouping the table and tr elements on a single line (thus removing
the newline character and the space), stripping out any whitespace before the td
element and also grouping the closing tr and table elements. This way,
we still keep a little "visual" hierarchy (the td being on its own line) while saving
4 spaces + 2 newline characters (and need I remind you that on Windows systems a newline
is really 2 characters: carriage return + line feed!).
So this is the same code, structured as described above:
<table><tr>
<td>contents goes here</td>
</tr></table>
Of course, one can go even further and group everything on a single line -- but I
found that to be a bit confusing, especially when there's more than one td element
(table column) inside the tr (table row) element; if I keep each column
on its own line, I can easily see how many columns the table has just by quickly scanning
the HTML source.
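The saving claimed for the grouped table can be checked with a few lines of Python. This is just a sketch: it assumes the original snippet is indented with one space per level (which matches the "4 spaces + 2 newline characters" count), and `size_in_bytes` is a helper made up for this example.

```python
# The two versions of the table, with one space of indentation per level (assumed).
indented = "<table>\n <tr>\n  <td>contents goes here</td>\n </tr>\n</table>"
grouped = "<table><tr>\n<td>contents goes here</td>\n</tr></table>"

def size_in_bytes(html: str, newline: str = "\n") -> int:
    """Size of the markup when saved with the given line ending."""
    return len(html.replace("\n", newline).encode("ascii"))

# Unix-style line endings: one byte per newline.
saved_unix = size_in_bytes(indented) - size_in_bytes(grouped)
# Windows-style line endings: carriage return + line feed, two bytes per newline.
saved_windows = size_in_bytes(indented, "\r\n") - size_in_bytes(grouped, "\r\n")

print(saved_unix, saved_windows)
```

With single-byte newlines the grouped version is 6 bytes smaller; with Windows line endings the two removed newlines count double, so the saving grows to 8 bytes.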
Use Tabs rather than Spaces
If you decide to keep the indentation in your HTML source, use tabs rather than spaces:
partly because many editors display a tab character as the equivalent of 4 spaces (usually configurable) -- so
the indentation is easy to see -- and partly because a tab character is just one character!
To achieve the same indentation with spaces, one would have to
use 4 space characters in place of each tab character -- so for each indent you waste 3 characters!
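The "3 characters per indent" arithmetic can be sketched in Python as follows. The function `indent_with_tabs` and the sample snippet are hypothetical, and the sketch assumes the source is indented with exactly 4 spaces per level.

```python
def indent_with_tabs(source: str, width: int = 4) -> str:
    """Replace each full group of `width` leading spaces with one tab."""
    converted = []
    for line in source.split("\n"):
        stripped = line.lstrip(" ")
        leading = len(line) - len(stripped)
        # Leftover spaces (fewer than `width`) are kept as they are.
        tabs, leftover = divmod(leading, width)
        converted.append("\t" * tabs + " " * leftover + stripped)
    return "\n".join(converted)

# Hypothetical snippet indented with 4 spaces per level.
spaced = (
    "<table>\n"
    "    <tr>\n"
    "        <td>contents goes here</td>\n"
    "    </tr>\n"
    "</table>"
)
tabbed = indent_with_tabs(spaced)
saved = len(spaced) - len(tabbed)
print(saved)
```

The snippet contains four indent levels in total (1 + 2 + 1), so converting them saves 4 x 3 = 12 characters.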
Tidy the HTML after Publishing
So you like to see your HTML nicely indented at the time you write your article -- fine. But
once published, it's unlikely you would go back and re-edit the page (or at least it's probably
safe to assume that you will not edit the page that often). So this means that once your article
is published you could then run an automated task to strip out the whitespace.
I personally recommend using Tidy by Dave Raggett,
as it does in fact strip out the
spaces. (However, what it doesn't seem to do is also strip out the newline/line feed characters,
so you will still have to group several items on a line yourself.) On top of that, it also checks
whether your HTML page is W3C-compliant -- and if not, it can fix some of the errors.
You can run it with the following parameters after editing a file and right before
publishing it to the webserver (assuming you have just changed the file called index.shtml):
tidy -o index.shtml index.shtml
The above command will overwrite index.shtml after the HTML has been tidied up
and the whitespace has been removed.
Another way of doing it is to have tidy run recursively against all of the files on
your webserver and clean out the whitespace regularly. If you run this task, say, every night (or
whenever your webserver is at its lowest load -- so you don't interfere with the incoming
web traffic) then you can publish your files with all their spaces/tabs/indentation and rest assured
that tidy will strip out the spaces every day. In other words, within at most
24 hours of being published, your HTML page will be cleaned and stripped of unneeded whitespace.
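One possible shape for such a recurring task is sketched below in Python. It is an assumption-laden sketch rather than a finished tool: the function names and the list of file suffixes are made up for this example, it uses tidy's `-m` (modify in place) and `-q` (quiet) options instead of the `-o` form shown above, and by default it only lists the commands it would run.

```python
import subprocess
from pathlib import Path

def tidy_commands(root, suffixes=(".html", ".shtml")):
    """Return one in-place tidy command per matching file under root."""
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # -q suppresses non-essential output, -m rewrites the file in place.
            commands.append(["tidy", "-q", "-m", str(path)])
    return commands

def tidy_tree(root, dry_run=True):
    """Run tidy over every page under root; with dry_run, only list the commands."""
    commands = tidy_commands(root)
    if not dry_run:
        for command in commands:
            # tidy exits with 1 on warnings and 2 on errors, so don't raise on non-zero.
            subprocess.run(command, check=False)
    return commands
```

Calling `tidy_tree(web_root, dry_run=False)` from a nightly scheduled job (where `web_root` is wherever your pages live) would give the "cleaned within 24 hours" behaviour described above; trying it first with `dry_run=True` on a copy of the site is the safer route.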
There is of course a problem with this approach: if you rely on certain HTML hacks (normally
due to old legacy code), then you might find that tidy will "fix" those hacks for
you, thus potentially breaking your page layout. However, if your code is W3C-compliant (be it HTML
Transitional, XHTML etc.), you probably won't have to worry about that.
As an example, here is the first page in the "Web Programming" section
as it is published on the site (including whitespace) and this
is the same page after it has been run through tidy. You will notice that about
200 bytes have been stripped out (and that is a small page) -- and you could take this even
further because, as I said, tidy doesn't group more than one item on a line.
This is the same page after we cleaned it up with tidy
and also grouped some more elements together; the end result: a difference of 400 bytes between
the original and the last version! If this page is requested, say, 100 times a day, then you have
saved yourself 40K each day (so about 1.2MB per month) worth of traffic. (Obviously, if you use
gzip-encoding you should probably halve those numbers. It's worth mentioning, however,
that you would then not only decrease the monthly traffic by about 600KB but also decrease the load on the server,
as the gzip-ing is done against smaller content each time!)
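The traffic figures above are easy to reproduce. The sketch below assumes a 30-day month and treats gzip as halving the traffic, which is the rough guess made above rather than a measured ratio; the function name `traffic_saved` is made up for this example.

```python
BYTES_SAVED_PER_REQUEST = 400  # size difference quoted above
REQUESTS_PER_DAY = 100         # example request rate quoted above
DAYS_PER_MONTH = 30            # assumption: a 30-day month

def traffic_saved(bytes_per_request, requests_per_day, days=1, gzip_factor=1.0):
    """Bytes of traffic saved over `days`, scaled by an assumed gzip factor."""
    return bytes_per_request * requests_per_day * days * gzip_factor

daily = traffic_saved(BYTES_SAVED_PER_REQUEST, REQUESTS_PER_DAY)
monthly = traffic_saved(BYTES_SAVED_PER_REQUEST, REQUESTS_PER_DAY, DAYS_PER_MONTH)
# Assumed: gzip roughly halves what actually goes over the wire.
monthly_gzip = traffic_saved(
    BYTES_SAVED_PER_REQUEST, REQUESTS_PER_DAY, DAYS_PER_MONTH, gzip_factor=0.5
)

print(f"per day: {daily / 1000:.0f} KB")
print(f"per month: {monthly / 1_000_000:.1f} MB")
print(f"per month with gzip: {monthly_gzip / 1000:.0f} KB")
```

Plugging in your own page's byte difference and request rate gives a quick estimate of whether the clean-up is worth automating.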