
*To*: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
*Subject*: Re: Unicode and math symbols
*From*: Chris Rowley <C.A.Rowley@open.ac.uk>
*Date*: Mon, 24 Feb 1997 19:12:31 GMT
*Cc*: bkph@ai.mit.edu, tex-font@math.utah.edu
*In-Reply-To*: <Pine.SUN.3.95q.970224125523.245C-100000@enoshima>
*References*: <199702221511.PAA14393@fell.open.ac.uk> <Pine.SUN.3.95q.970224125523.245C-100000@enoshima>

Martin wrote --

> The "rubbish" parts are usually due to backwards compatibility issues.
> I.e. there is some national standard or some industry or company
> encoding that contains these things.

Which may well also explain why the set of math symbols is so bizarre.

> In general, I agree with Chris that for systematic form changes
> in the alphabet, additional information (such as font information
> on a lower level, or structural information on a higher level)
> should be used. On the other hand, if there is a well-used
> Math symbol that isn't in Unicode, I would suggest to make
> a formal proposal for putting it in, with all the necessary
> data.

What makes it well-used? If you look at something like formal methods or logic, you find all sorts of symbols and their number increases rapidly: are these "well-used"? Are they "maths"? The general problem is that mathematical notation is by its nature not standardised, in either form or meaning.

> One thing not really clear in the Math area is the distinction
> between semantics and abstract form.

And also the relationship between them. Please do not let Unicode become caught up in the problem of expressing the semantics of mathematical notation. The only semantics that Unicode gives to 0061 is a standard name and the fact that most people expect something with that name to look like "a" or "the rounder form used in some fonts"; it does not say that "when used in English as the only letter in a word it is the definite article", nor should it. So please leave the meaning of math symbols (which is also highly context-dependent) to the mathematical reader.

Another reason for keeping such discussions out of the Unicode area right now is that a lot of effort is going into deciding what can and should be standardised at the DTD level (in particular HTML-math).
I think that this fits in with bb's comment that there are standard SGML public entities for math notation, and with the way users are used to encoding maths: at least for the moment, that should be the only place where we try to standardise any sort of semantics. One reason for this is that the natural structure of even quite simple typeset maths is visually much more complex than the Unicode model (for Latin-based systems) of "base + diacritics", and it is not closely related to the more complex visual structure of other writing systems. It may well be possible to extend Unicode to cope with this, but again this would only cover some math notation, never all of it, so it does not seem to me to be a worthwhile activity, at least not right now.

> For example, should there
> be one codepoint for "set difference", and this could look
> like "-" or like "\" or whatever, depending on the font (and
> maybe other setting),

No, that is not a font-dependent thing (at least I would shoot any editor who decided it was :-). I can also easily find places where both would be needed (for very similar operations that had to be distinguished in a certain context), or indeed 3 or more similar things (at some stage one would stop using different symbols and instead use (TeX notation): \mathbin{\setminus_n}) ... in other words, I think that this is the wrong question.

> while there is another "-" for subtraction,
> one for hyphen, and so on, or should there be one and the same
> "-" for various purposes, and one and the same "\" for various
> purposes, and the slight differences in shape, size, and placement
> be dealt with depending on circumstances (e.g. Math or not).

This is a much better question: one reason is that I suspect that "minus" needs to be used in places that really do not require to be labelled "language=math" (but some would argue with this). And if something that should look like a minus sign is used outside math, then it would be a great blessing if the right symbol was used.
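The \mathbin{\setminus_n} idea above can be sketched in LaTeX; the macro name \setminusn here is my own illustration, not anything standard:

```latex
% A minimal LaTeX sketch of distinguishing several similar
% "set minus"-like operations by indexing one symbol, rather
% than asking for a new codepoint or glyph for each.
\documentclass{article}

% \setminusn{n}: a binary operator built from \setminus plus an index
% (hypothetical macro, for illustration only).
\newcommand{\setminusn}[1]{\mathbin{\setminus_{#1}}}

\begin{document}
$A \setminusn{1} B \qquad A \setminusn{2} B$
\end{document}
```

The \mathbin wrapper is what makes the compound print with binary-operator spacing, which is the point: the distinction lives in markup, not in the character repertoire.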
Within maths, based on my opinion above, I think that the canonical form should be an SGML entity called "minus" (or possibly two: one called "unary-minus" and one called "binary-minus", but not one called "set-minus"; and here we get dangerously close to the question: how much of the mathematical semantics should/can be encoded at any level?). This would not preclude a Unicode character called "minus" appearing within math, but it means that when it appears there it is a short form for an entity (in the abstract model; I am not saying that SGML's contorted syntax will allow this).

> I guess we certainly need some amount of both (the latter e.g.
> to distinguish between a hyphen and a dash), but in general,
> on the level of character encoding, it's easier for most
> people to deal with "one shape, one code", and so that will
> probably prevail in the long run.

How much longer must a dash become before its shape changes? :-) Yes, I agree with you: let pragmatism rule; which is why I asked some time ago (maybe I missed the answer?): What are the practical benefits of having some set of mathematical symbols in Unicode? Is it the canonical name that is important? Or assigning a standard code-value to that name, or both?

chris
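The unary/binary distinction that "unary-minus" and "binary-minus" entities would encode is, as it happens, one TeX already makes from context rather than from the character: a minus between two atoms is set as a binary operator with medium space around it, while a minus with no usable atom on its left is demoted to an ordinary atom. A minimal sketch:

```latex
% TeX's contextual handling of the unary/binary minus distinction:
% the same "-" character is classed differently by position.
\documentclass{article}
\begin{document}
$a - b$   % binary minus: medium space on both sides
$-b$      % unary minus: demoted to an ordinary atom, hugs its operand
$a + -b$  % after another operator, "-" is again treated as unary
\end{document}
```

Which is one argument that "one shape, one code" plus context can carry quite a lot of the load, without separate codepoints per role.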

**Follow-Ups**:
- **Re: Unicode and math symbols**, *From:* "Martin J. Duerst" <mduerst@ifi.unizh.ch>

**References**:
- **Re: Unicode and math symbols**, *From:* Chris Rowley <C.A.Rowley@open.ac.uk>
- **Re: Unicode and math symbols**, *From:* "Martin J. Duerst" <mduerst@ifi.unizh.ch>
