what char encoding are plain text files?

 
Proteus

(Msg. 1) Posted: Sun Sep 25, 2005 5:50 pm
Post subject: what char encoding are plain text files?
Archived from groups: alt.www.webmaster

Can someone please explain to me what type of character encoding (Unicode
UTF8, ISO whatever, etc) plain vanilla text files are (I am talking text
files as created for example with a linux vi or gedit or vim editor, the
simplest text files)?

I have been running into problems with HTML files I made, uploaded to a
proprietary Content Management System (online campus software), that has
an online html editor; when I then download my html, it seems to be funked
somehow by the online software system so that when I try to look at my
downloaded html with linux 'less' command or the vi editor, I get a
warning that it is a binary file and all I see is gibberish (binary funky
characters) rather than the text based html tags. I can still open and
view the html in Mozilla Composer, and if I save it with e.g. Unicode UTF8
character encoding I can then see it with less command or the vi editor or
some other plain text editor. I can also open the funked html in
Openoffice, where I see the html source as tags, but just before the first
<HTML> tag there are two funky binary characters (a y with two dots over
it, followed by a vertical line with a backwards c attached to it); if I
delete those two funky characters, then save the file with OpenOffice, I
can then view the saved html with vi editor, etc.

Very odd, I do not understand what is going on. If anybody can enlighten
me I will be very grateful.
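
Editor's note: the two characters described above (a y with two dots, then a vertical line with a backwards c) look like ÿ and þ, which is how the bytes FF FE appear when read as Latin-1. Those two bytes are the byte-order mark (BOM) of little-endian UTF-16, so one plausible reading is that the online editor saved the page as UTF-16. That is a guess from the description, not something the thread confirms. A minimal Python sketch, under that assumption, that checks for a BOM and rewrites the data as UTF-8 (the names `sniff_bom` and `to_utf8` are made up for this example):

```python
import codecs

# Byte-order marks and the Python codec each one suggests.
# The UTF-32 marks come first because the UTF-32 LE mark begins
# with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a codec name suggested by a leading BOM, or None if there is none."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding
    return None


def to_utf8(data: bytes, fallback: str = "utf-8") -> bytes:
    """Decode with the BOM-suggested codec (or the fallback), re-encode as UTF-8 without a BOM."""
    encoding = sniff_bom(data) or fallback
    return data.decode(encoding).encode("utf-8")


# Hypothetical sample: "<HTML>" saved as little-endian UTF-16 with a BOM.
sample = codecs.BOM_UTF16_LE + "<HTML>".encode("utf-16-le")
print(sample[:2].decode("latin-1"))  # the two stray characters
print(sniff_bom(sample))             # utf-16
print(to_utf8(sample))               # b'<HTML>'
```

A file without a BOM gives `None` here, which says nothing about its encoding either way.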

Proteus

(Msg. 2) Posted: Mon Sep 26, 2005 9:44 am
Post subject: Re: what char encoding are plain text files?

On Mon, 26 Sep 2005 08:15:44 +0100, Toby Inkster wrote:
...
> That depends on what character encoding the files are in.
>..

Ok, fair answer. Then is there some utility or way to easily determine
what type of char encoding a text file is in? I mean, if I have
somefile.txt or somewebpage.html, how can I know what char encoding is
embedded in the file? Is there some utility (hopefully in linux) to look
at the type of encoding used?

Proteus

(Msg. 3) Posted: Mon Sep 26, 2005 11:31 am
Post subject: Re: what char encoding are plain text files?

On Mon, 26 Sep 2005 16:51:33 +0100, Brian Wakem wrote:
...
> $ file ./*
> ./1123499855.671.doc: Microsoft Office Document
> ./domains: ASCII text
> ./mbox: ASCII mail text..

Interesting. For HTML docs, though, the file command just reports HTML,
with no char encoding listed, not even if I rename the .html to .html.txt.
But that is a nice utility to know.
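
Editor's note: for HTML specifically, the page often declares its own encoding in a `<meta>` tag near the top, so that is one place to look when a generic tool only reports "HTML". A rough Python sketch of that check follows; the function name `declared_charset` and the 2048-byte scan window are arbitrary choices for this example. It only reports what the file claims, which may not match how the bytes were actually written, and it will not find the tag in a file stored as UTF-16.

```python
import re

# Matches charset=... inside a <meta> tag, for example:
#   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
#   <meta charset="utf-8">
META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def declared_charset(data: bytes, scan_bytes: int = 2048):
    """Return the charset named in a <meta> tag near the top of the file, or None."""
    match = META_CHARSET.search(data[:scan_bytes])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head>'
print(declared_charset(page))              # iso-8859-1
print(declared_charset(b"<html></html>"))  # None
```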
Doc O'Leary

(Msg. 4) Posted: Mon Sep 26, 2005 2:43 pm
Post subject: Re: what char encoding are plain text files?

In article <pan.2005.09.26.14.44.34.625071 DeleteThis @uselessemail.net>,
Proteus <proteus DeleteThis @uselessemail.net> wrote:

> Ok, fair answer. Then is there some utility or way to easily determine
> what type of char encoding a text file is in? I mean, if I have
> somefile.txt or somewebpage.html, how can I know what char encoding is
> embedded in the file? Is there some utility (hopefully in linux) to look
> at the type of encoding used?

No. Bits are just bits if there is no metadata that tells you the
encoding. In one text encoding a certain bit sequence might be a bullet
point and in another it might be the symbol for the British Pound. The
best a computer could do is the best a human can do: look at any
particular encoding and say it's *probably* wrong, but that doesn't get
you to what encoding is *definitely* right. You really need the author
to add that metadata if you want it to be clear.
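
Editor's note: the point above can be tried out in a few lines of Python. The sketch decodes the same bytes under several candidate encodings with strict error checking. A decoding error proves a candidate wrong, while a clean decode only means the candidate has not been ruled out. The function name `possible_encodings` and the sample byte are made up for this example; the byte 0xA5 was picked because it is a bullet in Mac Roman but a yen sign in Latin-1, which is not the exact bullet/pound pair mentioned in the post.

```python
def possible_encodings(data: bytes, candidates):
    """Return the candidates that decode `data` without error.

    A failure proves a candidate wrong; a success only means it is not ruled out.
    """
    survivors = []
    for encoding in candidates:
        try:
            data.decode(encoding)  # strict error handling is the default
        except UnicodeDecodeError:
            continue
        survivors.append(encoding)
    return survivors


raw = b"\xa5 item"
candidates = ["ascii", "utf-8", "latin-1", "mac_roman", "cp437"]

# ascii and utf-8 are eliminated; the other three all remain possible.
print(possible_encodings(raw, candidates))

# Each surviving candidate turns the same byte into a different character.
for encoding in possible_encodings(raw, candidates):
    print(encoding, repr(raw.decode(encoding)))
```

Three encodings survive and each gives a different first character, so nothing in the bytes alone says which one the author meant.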