Talk:Byte


Decibel is not an SI unit

In the section titled "Unit Symbol" there is an entire paragraph explaining that the symbol 'B' also denotes the bel, an SI unit. This is not true.

Although often used with SI prefixes, e.g. the decibel (dB), the bel is not, and can never be, an SI unit: it is a dimensionless ratio. Because it is dimensionless, it is often necessary to indicate how it was calculated by adding an appropriate suffix (e.g. dBi, dBm) in order to make meaningful comparisons.

http://physics.nist.gov/cuu/Units/outside.html
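For context, here is a sketch of the usual definition (power quantities assumed), which shows why the bel carries no dimension: it is the logarithm of a ratio of two like quantities.

    L_\text{B} = \log_{10}\!\left(\frac{P}{P_0}\right)\ \text{[bel]}, \qquad
    L_\text{dB} = 10\,\log_{10}\!\left(\frac{P}{P_0}\right)\ \text{[decibel]}

Here P_0 is whatever reference power the suffix identifies (e.g. 1 mW for dBm).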

JNBBoytjie (talk) 10:38, 13 April 2016 (UTC)

Thank you. Done. Kbrose (talk) 14:17, 13 April 2016 (UTC)

C 'char' type

AFAICS 'unsigned char' is right. Based on the C Standard, (signed) char need only hold values between -127 and 127 inclusive, in other words only 255 distinct values. If you want a guarantee of 256 distinct values you need unsigned char. Ewx (talk) 08:04, 25 August 2016 (UTC)

It is not necessary to specify unsigned char. The C standard already mandates that a value stored in a char is guaranteed to be non-negative. A signed char is an integer type and has to be declared, not an unsigned char. Kbrose (talk) 11:50, 25 August 2016 (UTC)
I have two problems with this concept:
What happened to -128? (0x80)
What version of the C standard? The original compilers I was using in the '80s (Turbo-C 1.0, 1.5 and 2.0) treated 'char' as signed by default, but you could configure the compiler to make it unsigned by default. That was obviously before the current C standard... Dhrm77 (talk) 13:51, 25 August 2016 (UTC)
Formally speaking there is only one C standard, ISO/IEC 9899:2011; the rest have been withdrawn, as you can see on the ISO website. That doesn't stop people referring to older revisions (or drafts, given the excessive cost of the current version), or implementing them, though. In this case however, the question is irrelevant: all versions of the C standard permit char to be either a signed or an unsigned type. As for 'what happened to -128', the point is to permit a variety of representations of signed types; there's more to the world than x86 and two's complement. Ewx (talk) 08:05, 26 August 2016 (UTC)
Which C standard? Is the specification for char the same in C89, C90, C99 and C11? If not, the article should reflect that. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:45, 25 August 2016 (UTC)
No, char is not guaranteed to be unsigned. C99 and n1570 are completely explicit about this (6.2.5#15). Ewx (talk) 07:57, 26 August 2016 (UTC)

For the record, this is the text of 6.2.5#3:

An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

Kbrose (talk) 11:57, 26 August 2016 (UTC)

You're reading the wrong bit. That just tells you that certain characters have non-negative representation in a char. It does not tell you that the type itself is unsigned. Once again, see 6.2.5#15 for text that is actually relevant here. Ewx (talk) 18:50, 26 August 2016 (UTC)
I have to agree with Ewx. I think it just says that if you need to store non-negative values you can, and if you want to store something else, it depends on the implementation, which is a way of saying that they are not defining whether a char is signed or unsigned. It even opens the door to implementing a range of -64 to +191 if you wanted to, instead of the classic -128 to +127 or 0 to 255. Dhrm77 (talk) 22:02, 26 August 2016 (UTC)
-64 to 191 would be forbidden (SCHAR_MIN must be at most -127) but -127 to 127 is permitted (and realistic for 1s complement machines); so is -2147483648 to 2147483647 or 0 to 4294967295 (and realistic for word-addressed machines). Ewx (talk) 07:25, 27 August 2016 (UTC)
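To make the implementation-defined nature of plain char easy to see, here is a minimal C sketch (assuming a hosted implementation with <limits.h> and <stdio.h>); whether it reports signed or unsigned depends entirely on the compiler and target:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* CHAR_MIN is 0 where plain char is unsigned, and SCHAR_MIN
           (which must be at most -127) where it is signed. */
        printf("CHAR_BIT  = %d\n", CHAR_BIT);
        printf("char range: %d .. %d\n", CHAR_MIN, CHAR_MAX);
        printf("plain char is %s here\n", (CHAR_MIN < 0) ? "signed" : "unsigned");
        return 0;
    }

On x86 Linux with gcc this typically reports a signed range; on ARM Linux the same code typically reports unsigned, which is exactly the latitude 6.2.5 gives implementations.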

Merge Octet (computing) into Byte

Octet (computing) should be merged into Byte, as Octet (computing) is another name for Byte and Wikipedia does not have two articles for two names of the same thing (instead, both names are mentioned in the WP:LEAD and the article's title is at the WP:COMMONNAME). -KAP03(Talk • Contributions • Email) 22:47, 26 March 2017 (UTC)

  • Oppose. I see your point, but I'm afraid that merging may blur the differences between the terms even more (than how they are confused in the current articles). Byte, Octet, Octlet (and the unmentioned Octad) are related terms, but in general they are not multiple names for the same thing. They need to be distinguished carefully, and we should instead improve/sharpen the articles, emphasizing their differences.
While byte is today most often understood as referring to a group of 8 bits, without context it does not define a specific count of bits. Historically, a byte was defined as any group of 1 to 6 bits (with 5- and 6-bit bytes being the most commonly used forms even at that time). Later it was defined as the group of bits necessary to hold a character, that is, 5 to 8 bits. With the advent of microcomputers in the late 1970s / early 1980s, this shifted towards meaning 8 bits by default. Byte is therefore a platform-specific term.
Octet, however, was specifically defined to avoid the ambiguity of the term byte, and always means 8 contiguous bits, regardless of context and platform. That's why octet (rather than byte) is the term used in formal definitions of e.g. network protocols, in the telecommunication industry, etc. (A short C sketch after the votes below illustrates the distinction.)
Octad is a term similar to octet; however, it has fallen into disuse in recent decades. Like octet, it specifically means 8 bits, but it looks at them from the angle of how many bits are necessary to encode 129 to 256 states (at least this is what I draw from the usage of similar terms like tetrads and pseudo-tetrads). Looked at from that angle, it appears not to matter whether the 8 bits holding the state are physically grouped together.
Octlet (per IEEE 1754) means 8 octets or 64 bits, so it is clearly different from octets.
--Matthiaspaul (talk) 12:31, 27 March 2017 (UTC)
  • oppose - No, they are not two names for the same thing, but names for different things. As a side note, the VFL instructions of Stretch could specify any byte size from 1 to 8, and 12 was a common byte size for CDC users. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:22, 28 March 2017 (UTC)
  • Oppose - There has been significant discussion of the distinction on the respective article talk pages. Bytes have not always been 8 bits. An octet is defined as 8 bits. Bytes are used in processors. Octets are used in communications. It is probably possible to cover both in a single article, but that would not be a trivial merge, and I'm not convinced that what we'd end up with would be an improvement over current coverage. If someone wants to create a sandbox version of a merged Byte article, I'd be happy to assess it in more detail. ~Kvng (talk) 18:34, 9 April 2017 (UTC)
  • Support - I think the articles should be merged. It's true that byte has meant other things in the past, but the modern definition (according to the International System of Quantities) is 8 bits. Dondervogel 2 (talk) 18:50, 9 April 2017 (UTC)
  • Oppose - The terms don't always represent the same thing and are used for different purposes. Jko831 (talk) 19:37, 21 September 2017 (UTC)
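Following up on the note above, here is a short C sketch of how the byte/octet distinction surfaces in practice (assuming <limits.h> and the optional exact-width types of <stdint.h>): a C "byte" is whatever plain char is on the platform, while an octet is exactly 8 bits and may not even have a native type on exotic hardware.

    #include <limits.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* The C byte: CHAR_BIT is at least 8 but may be larger
           (9-bit and 16-bit chars exist on some real machines). */
        printf("bits per C byte (CHAR_BIT): %d\n", CHAR_BIT);

    #ifdef UINT8_MAX
        /* uint8_t is an exact 8-bit type -- effectively an octet.
           The standard makes it optional precisely because not every
           platform can address 8-bit units. */
        printf("uint8_t exists: this platform has a native octet type\n");
    #else
        printf("no uint8_t: no native octet type on this platform\n");
    #endif
        return 0;
    }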

Architectural support for byte sizes other than 8

Off the top of my head, these machines come to mind as supporting byte sizes other than 8:

  • CDC 3600 and 3800
    1 to 48 bits
  • DEC 36-bit machines
    1-36 bits
  • GE, Honeywell and Bull 36-bit machines
    6 or 9 bits
  • RCA 601
    3, 4, 6, 8 or 24 bits
  • UNIVAC and Unisys 36-bit machines
    6, 9, 12 or 18 bits

Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:41, 29 March 2017 (UTC)

Shmuel, do you actually mean bytes in this context? I recall the larger sizes being called words. If you can, please provide some refs; it would be great if we could track this down to historic sources in order to improve the article.
--Matthiaspaul (talk) 00:37, 30 March 2017 (UTC)
Yes, I actually mean byte, and both CDC and DEC used the word byte as part of the instruction names, e.g., Deposit Byte. Two easy citations from bitsavers are:
  • 3600 Computer System Reference Manual (PDF), CDC, October 1966, 60021300
  • Book 1: Programming with the PDP-10 Instruction Set (PDF), PDP-10 System Reference Manual, DEC, August 1969
Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:40, 3 April 2017 (UTC)
Another more recent example would be the Nintendo 64 with 9-bit bytes. 2003:71:CF10:FD00:A843:F00A:C1FE:7F1F (talk) 20:40, 23 September 2018 (UTC)
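For readers who have only ever seen 8-bit bytes, here is a hypothetical C sketch of what variable-size byte access looks like, in the spirit of the PDP-10's byte instructions (LDB and friends), which address a field by position and size rather than assuming 8-bit units. The helper name and the sample word are made up for illustration.

    #include <stdint.h>
    #include <stdio.h>

    /* Extract a "byte" of 'size' bits starting 'pos' bits above the low
       end of a 36-bit word stored right-justified in a uint64_t. */
    static uint64_t load_byte(uint64_t word, unsigned pos, unsigned size)
    {
        uint64_t mask = (size >= 64) ? ~UINT64_C(0) : ((UINT64_C(1) << size) - 1);
        return (word >> pos) & mask;
    }

    int main(void)
    {
        uint64_t word = UINT64_C(0765432101234);   /* 12 octal digits = 36 bits */

        /* The same word read as six 6-bit bytes... */
        for (int i = 5; i >= 0; i--)
            printf("%02o ", (unsigned)load_byte(word, (unsigned)i * 6, 6));
        printf("\n");

        /* ...and as four 9-bit bytes. */
        for (int i = 3; i >= 0; i--)
            printf("%03o ", (unsigned)load_byte(word, (unsigned)i * 9, 9));
        printf("\n");
        return 0;
    }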

Status of error checking bits

The IBM 7030, for which the term byte was coined, did not include error checking bits as part of a byte. Nor did the DEC PDP-6, the CDC 3600, or any of the other computers with the ability to access bytes of various sizes. The System/360 Principles of Operation contains the text "Within certain units of the system, a bit-correction capability is provided by either appending additional check bits to a group of bytes or by converting the check bits of a group of bytes into an arrangement which provides for error checking and correction (ECC). The group of bytes associated with a single ECC code is called an ECC block. The number of bytes in an ECC block, and the manner in which the conversion or appending is accomplished depend on the type of unit involved and may vary among models." Accordingly, I call for the reinstatement of the text "The byte size designates only the data coding and excludes any parity or other error checking bits." Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:30, 21 June 2017 (UTC)

I removed that statement because the lead is supposed to summarize the important points in the article body. The statement I removed (a) is not covered in the body and (b) may not be one of the more important points about the topic. I am not at all opposed to including this information in the article body, and once that is stable, we can consider it for inclusion in the lead. ~Kvng (talk) 14:51, 24 June 2017 (UTC)
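For readers unfamiliar with the distinction being discussed: check bits are computed from the data bits and carried alongside them, not counted in the byte size. A minimal C sketch of the simplest case, a single even-parity bit per 8-bit byte (the function name is mine, for illustration):

    #include <stdint.h>

    /* Return the even-parity check bit for one 8-bit byte: the bit value
       that makes the total number of 1 bits (data plus parity) even.
       It is stored or transmitted alongside the byte, e.g. as a 9th bit,
       and is not part of the byte's 8 data bits. */
    static unsigned even_parity(uint8_t byte)
    {
        unsigned ones = 0;
        for (int i = 0; i < 8; i++)
            ones += (byte >> i) & 1u;
        return ones & 1u;
    }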

"Octad"[edit]

Is the origin unclear?

The article currently states:

"The exact origin of the term is unclear, but it can be found in British, Dutch, and German sources of the 1960s and 1970s, and throughout the documentation of Philips mainframe computers."

Surely this is just the eighth member of the sequence that starts "monad", "dyad", "triad", i.e. a group of eight things (looking toward Greek). "Octet" and "octad" appear similar because the Latin and Greek cardinal numbers for 8 have the same form (octō, ὀκτώ). Compare e.g. "quintet" vs "pentad" for a group of 5.

Of course the correct term would be an ogdoad (from the genitive of the ordinal) but not everyone who wants to use precise, technical language also knows Greek.

If the question is about who first used the term in its computing sense, that may be unanswerable because it probably slipped in from an earlier technical or mathematical sense. –moogsi(blah) 23:46, 23 October 2018 (UTC)

… representing a binary number

oRLY?
Anybody with programming experience, even an amateur, knows that bytes more frequently do not represent numbers (serving as opcodes, parts of bitmaps or compressed data…) than represent numbers explicitly. Even for a number format as complicated as IEEE 754, it would not be helpful to think of every isolated byte as a sensible numerical value. Any objections to complete removal? If so, then perhaps change it to "are capable of representing a binary number"? Incnis Mrsi (talk) 10:06, 27 July 2019 (UTC)
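To illustrate the point, here is a small C sketch in which the same byte value is raw data in one context and a number only when explicitly interpreted as one (the specific values are arbitrary examples):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* As machine code, 0x90 happens to be the x86 NOP opcode;
           reading it as the quantity 144 would be meaningless there. */
        uint8_t code[] = { 0x90, 0x90, 0x90 };

        /* As a row of a 1-bit-per-pixel bitmap, the same byte is just
           the pattern 10010000, again not a number in any useful sense. */
        uint8_t bitmap_row = 0x90;

        printf("bitmap row: ");
        for (int i = 7; i >= 0; i--)
            printf("%u", (unsigned)((bitmap_row >> i) & 1u));
        printf("\n");

        /* Only under an explicit numeric interpretation is it 144. */
        printf("as a number: %u\n", (unsigned)bitmap_row);

        (void)code;   /* not executed, only shown as data */
        return 0;
    }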