Hebrew on the Web

I've been researching Hebrew on the web for several months. A friend of mine at Hebrew College asked me to look at several URLs and figure out how he could put things online that would be equally accessible from Hebrew-enabled Macs or PCs (or, for that matter, Linux, Unix, whatever). As folks who have done this for years know, this is messy. There are two general standards: the Windows way (charset=Windows-1255) and the supposedly standard way (charset=iso-8859-8). A third option is to encode your pages as straight UTF-8 and take advantage of Unicode. Last year I did some tests with my friend Jack Woehr, and we discovered that if you really write Unicode, Hebrew displays fine on both Mac and PC using utf-8. This year I got a quick project to get some Hebrew up on the web for "We are the future" and jumped in to see if I could find something simple. The results mostly work on the PC, but there are some issues on the Mac, under OS X, using Safari.
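To make the difference between those three charset declarations concrete, here is a minimal sketch in Python (my choice for illustration; nothing in the original pages used it). It encodes a single Hebrew letter under each charset and prints the resulting bytes. The helper name encode_all is hypothetical.

```python
# Encode one Hebrew letter under the three charsets named above.
ALEF = "\u05d0"  # HEBREW LETTER ALEF

def encode_all(text):
    """Return {charset label: encoded bytes} for the given text."""
    charsets = ["windows-1255", "iso-8859-8", "utf-8"]
    return {name: text.encode(name) for name in charsets}

encoded = encode_all(ALEF)
for name, raw in encoded.items():
    # Show each byte sequence as hex so the three can be compared.
    print(f"{name:>13}: {raw.hex()}")
```

The two single-byte charsets give the same one-byte value for this letter, while UTF-8 uses two bytes. The sketch only compares byte values; it says nothing about how a browser orders the text on screen.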
The main text flows correctly - if you read the pages using Safari under OS X, you won't notice any problems at first. Then you'll note that the ambiguous characters (glyphs such as punctuation that could be placed differently depending on the language the browser thinks is the base for the current paragraph) are misplaced. Bidirectionality is a messy subject. The problem is that when you tell a browser only that you are using UTF-8, for instance, it can't be sure where to put punctuation marks: if the base language is English, say, the period goes at one end of the displayed sentence; if the base language is Hebrew, it goes at the other. On the pages I did for "We are the future" (see, for instance, www.wearethefuture.com/he/concert_about.html), everything looks fine on a PC using IE or Mozilla. But fire up OS X and take a look with Safari, and you see hyphens and periods that are placed wrong. Not fatal - at least the text flows nicely from right to left, as it should - but not what I want. Here is what I used:
  • In the head of each document, I noted utf-8: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  • Paragraphs got styled thus: <p align="right" lang="HE" dir="RTL">. (Normally I'd put this in a style sheet, but this was a drop-in and I didn't want to isolate the pages I worked on from the general style sheet and any global changes. And the job was too fast to comfortably isolate the items in a local style sheet.)
  • The Hebrew characters were encoded using some Microsoft(?) character set, as numeric references in the range (aleph to taf) &#1488; through &#1514;. I got this by saving a Word document containing Hebrew as HTML. I used Word 2000, under Windows NT, with Hebrew resources installed. (I don't think that what was saved is Unicode, which I recall having a different, much larger hex offset - but I'm new to this and could be entirely wrong.)
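The punctuation problem described before this list comes down to how Unicode classifies each character's direction. As a rough illustration (a sketch in Python, which is my assumption and not part of the original pages), the standard unicodedata module reports the bidirectional class of any character. The helper name bidi_classes and the sample string are hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs from the Unicode Character Database."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# A Hebrew letter, a Latin letter, and some punctuation of the kind
# that ends up misplaced: period, hyphen, exclamation mark.
sample = "\u05d0" + "a" + ".-!"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

Letters carry a strong class ("R" for Hebrew, "L" for Latin), while the punctuation does not, so its placement depends on the surrounding text and on the base direction. That is why an explicit dir="RTL" on the paragraph matters.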
Microsoft puts charset=Windows-1255 in the file when saving the Word doc as HTML. This is the near-universal character set used in Israel. The thing is, I don't want to rely on browsers on non-Windows platforms simply figuring out how to accommodate Windows. I'd like something universal. The goal is to stop having to worry or think about platform. As it turns out, it doesn't matter in this case: putting "charset=windows-1255" at the top of the file changes nothing compared to utf-8.
Additional notes: I got very curious and tried to view the pages using IE 5 and Mozilla 1.2 under OS 9.x on the Mac. IE displayed nothing but question marks, regardless of whether the meta tag indicated utf-8 or windows-1255. Mozilla looked perfect, just as on the PC. So I did the obvious and downloaded the OS X version of Mozilla. Still perfection. The problem is Safari, not my encoding.
I'm willing to bet that what I did is still not the optimal way to do this. It worked well enough for this project (pending some final fussing to see if I can figure out how to encode things so that Safari groks the language and base direction correctly, which of course depends on finding a workaround for Safari's bugginess), and it feels great to have some Hebrew web pages up. Next try: do it better. First path to explore? Convert these character encodings to actual Unicode and see if that works better.
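A first check on that next step can be sketched in a few lines of Python (an assumption for illustration; the original pages were produced with Word, not Python). It decodes the first and last decimal references in the range 1488 through 1514 with the standard html module and looks up what they are. The helper name decode_refs is hypothetical.

```python
import html
import unicodedata

def decode_refs(markup):
    """Replace HTML character references with the characters they name."""
    return html.unescape(markup)

first = decode_refs("&#1488;")
last = decode_refs("&#1514;")

# Print each code point in the usual U+XXXX hex form with its Unicode name.
print(f"U+{ord(first):04X}", unicodedata.name(first))
print(f"U+{ord(last):04X}", unicodedata.name(last))

# The same letter written as literal UTF-8 bytes instead of a reference.
print(first.encode("utf-8"))
```

This suggests the decimal references are already Unicode code points (1488 is hex 05D0), since numeric character references in HTML refer to the document character set and not to the charset declared in the meta tag. If so, that would explain why swapping utf-8 for windows-1255 changed nothing, and "converting to Unicode" may come down to replacing the references with literal UTF-8 text.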

2 Comments

שלום

Your page www.wearethefuture.com/he/concert_about.html looks great in my Safari 2.0 under Mac OS 10.4 (i.e., Tiger) with text encoding set to utf-8.

patrick iglesias-zemmour
iglesias@math.huji.ac.il

Thanks!

About this Entry

This page contains a single entry by Ari Davidow published on May 21, 2004 11:26 PM.
