Hi,
I do hope we can get an understanding of the TPU... Perhaps some group think is in order. Let's discuss a few ideas. Here is what I came up with.
Recap:
- There is a section of code that loads 2048 bytes into a multiplexed internal RAM.
- The loader writes this section into the TPU as 1024 words.
- The loader writes four 16-bit words and then increments an address pointer.
- I believe the section is written/used as 256 64-bit words (2048 bytes = 1024 16-bit words = 256 groups of four).
* See the code below
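To make the 2048 / 1024 / 256 arithmetic concrete, here is a rough Python sketch of how a dump of that block could be cut into rows of four 16-bit words. The name split_rows and the test image are mine, not from the firmware. The byte order is big-endian because that is how LDD reads a word on this CPU, but which of the four words ends up as bits 48:63 inside the TPU is an assumption (I take the first word in flash as the top one).

```python
import struct

def split_rows(image: bytes):
    """Split a raw microstore dump into rows of four big-endian 16-bit words.

    Assumption: the first word of each 8-byte group is bits 48:63.
    """
    if len(image) % 8 != 0:
        raise ValueError("image length must be a multiple of 8 bytes")
    return [struct.unpack_from(">4H", image, off)
            for off in range(0, len(image), 8)]

# Synthetic 2048-byte stand-in for the block at $A87C-$B07B (not real TPU data).
image = bytes(range(256)) * 8
rows = split_rows(image)

print(len(image) // 2, "16-bit words,", len(rows), "64-bit rows")
print([hex(w) for w in rows[0]])
```

With a real dump you would read the file into image instead of building the synthetic one.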
I decided to try to validate my assumption by looking at the words to see whether they are duplicated and at what interval. I excluded the word 0x0000 because it is often used as a default/null word: NoOps will often default to 0x0000 with all the other fields zeroed. My results exclude it.
If the store were 16 bits wide, I would expect duplicated words to be spread evenly over every word position in the store. This is not what I found. Looking at the store as an array of 64-bit words, the column that represents bits 48:63 has only 66 unique codes. Codes within this column are duplicated 160 times (again excluding 0x0000). This contrasts with duplication of those codes in the other three columns, which occurs only 0, 1 and 3 times.
I then looked at the second column: bits 32:47. Here there are 217 unique codes. Once again, the duplication of codes in other columns was very low: 0, 5 and 6.
For bits 16:31, there were 115 unique codes and duplication in other columns: 2, 5 and 0.
Last up, bits 0:15: there were 124 unique codes. Duplication in other columns: 2, 22 and 27.
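For anyone who wants to repeat the count, this is roughly the kind of tally I mean, as a Python sketch. It is not the exact script behind the numbers above. The function name column_stats and the three sample rows are made up for illustration, and the definitions are my assumptions: "repeats" is how many entries in a column repeat an earlier code in that same column, and "shared" is how many distinct codes in a column also turn up in each of the other columns. 0x0000 is skipped throughout.

```python
from collections import Counter

def column_stats(rows, skip=0x0000):
    """Per column: unique codes, in-column repeats, codes shared with other columns."""
    ncols = len(rows[0])
    # Collect each column's codes, dropping the null word.
    cols = [[r[c] for r in rows if r[c] != skip] for c in range(ncols)]
    sets = [set(col) for col in cols]
    stats = []
    for c in range(ncols):
        counts = Counter(cols[c])
        repeats = sum(n - 1 for n in counts.values())
        shared = [len(sets[c] & sets[o]) for o in range(ncols) if o != c]
        stats.append({"unique": len(sets[c]), "repeats": repeats, "shared": shared})
    return stats

# Made-up rows of four 16-bit words; not real microstore contents.
rows = [
    (0x1000, 0x0001, 0x0000, 0x00AA),
    (0x1000, 0x0002, 0x0000, 0x00BB),
    (0x2000, 0x0003, 0x1000, 0x00AA),
]
stats = column_stats(rows)
for c, s in enumerate(stats):
    print("column", c, s)
```

If the store really is four independent 16-bit fields per row, I would expect "repeats" to be high and "shared" to be low for each column, which is the pattern described above.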
===========
My thinking is that, given that all the data paths in the TPU are 16-bit, there would be no reason to code the processor section with less. That is to say, I excluded 8-bit by inspection. This is not solid, just from what I see.
If this storage were 16-bit based, I would expect to see an even duplication across all the 16-bit fields. What I see is heavily weighted to modulo 4: a code tends to repeat only at word positions that are a multiple of four words apart. When the storage is arranged as four columns of 16-bit words, each column has a high probability of a duplication within the column and a high probability of not matching the other columns.
My look into the TPU thus far suggests that it does not match the patent I brought up earlier BUT does have a lot of similarities to it. All the storage sizes match and the general working matches. Once again, my guess is that the idea for this patent came from this part. Perhaps this part is a single-processor version, and the idea behind the patent was to eliminate some of its latency (?)
===========
Very interested in your views.
-Tom
Code:
* INITIALIZE THE MICROSTORE
A4BE 4F CLRA ; SET FOR LOADING THE MICROSTORE
A4BF C6 03 LDAB #$03 ;
A4C1 FD 16 F0 STD $16F0 ;
A4C4 CE A8 7C LDX #$A87C ; MICROSTORE IN ESIDE FLASH
A4C7 18 CE 14 00 LDY #$1400 ; MICROSTORE LOAD POINTER (ALT FUNCTION)
A4CB EC 00 LDD $00,X ; MOVE MICROSTORE FROM FLASH TO THE TPU
A4CD FD 16 E0 STD $16E0 ;
A4D0 EC 02 LDD $02,X ;
A4D2 FD 16 E2 STD $16E2 ;
A4D5 EC 04 LDD $04,X ;
A4D7 FD 16 E4 STD $16E4 ;
A4DA EC 06 LDD $06,X ;
A4DC FD 16 E6 STD $16E6 ;
A4DF 18 EF 00 STY $00,Y ; UPDATE MICROCODE WORD
A4E2 18 08 INY ; POINTER FOR NEXT MICROCODE
A4E4 18 08 INY ;
A4E6 C6 08 LDAB #$08 ; INCREMENT FLASH MICROCODE POINTER
A4E8 3A ABX ; TO NEXT 64BIT WORD
A4E9 8C B0 7C CPX #$B07C ; MICROSTORE LOAD COMPLETE
A4EC 25 DD BCS $A4CB ; LOOP THROUGH STORE
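As a sanity check on the loop above, here is a small Python model of it. It only tracks the two pointers and the data register writes. The constants come straight from the listing. The function name trace_loader is mine, and treating the STY $00,Y as a "commit this 64-bit word" strobe is my reading of the comment, not something I have confirmed.

```python
FLASH_START = 0xA87C    # LDX #$A87C
FLASH_END = 0xB07C      # CPX #$B07C
TPU_PTR_START = 0x1400  # LDY #$1400
DATA_REGS = (0x16E0, 0x16E2, 0x16E4, 0x16E6)  # STD targets

def trace_loader():
    """Walk the loader loop; return (iterations, word writes, final X, final Y)."""
    x, y = FLASH_START, TPU_PTR_START
    iterations = 0
    word_writes = 0
    while True:
        # Four LDD n,X / STD $16Ex pairs: one 16-bit word each.
        word_writes += len(DATA_REGS)
        # STY $00,Y (assumed commit strobe), then INY twice.
        y += 2
        # LDAB #$08 / ABX: advance to the next 8 bytes in flash.
        x += 8
        iterations += 1
        # CPX #$B07C / BCS: loop while X is below the end address (unsigned).
        if not x < FLASH_END:
            break
    return iterations, word_writes, x, y

iterations, word_writes, x_end, y_end = trace_loader()
print(iterations, word_writes, hex(x_end), hex(y_end))
```

The model gives 256 passes and 1024 word writes, with Y running from $1400 to $1600 in steps of two, which is consistent with the recap at the top.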