Saturday, November 24, 2018

Hardware sprites on the FPGA computer

Adding hardware sprites

This is a follow-up to the FPGA computer post.

I have added hardware sprites to the graphics mode of my FPGA computer. It now supports up to 16 sprites, each 16x16 pixels in size. Here is how it looks on the monitor:
In the emulator, it looks the same:



Each sprite is defined by an 8-byte structure:
  • sprite definition data address (2 bytes)
  • x coordinate (2 bytes)
  • y coordinate (2 bytes)
  • transparent color (2 bytes).
The structure for the first sprite starts at address 56 (decimal). Each subsequent sprite structure starts 8 bytes later.
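The layout above can be written down as a small address calculation. This is only a Python sketch for illustration, not part of the FPGA project; the names `SPRITE_TABLE_BASE`, `FIELD_OFFSETS` and `sprite_field_address` are made up here, while the numbers (base 56, 8 bytes per structure, 2 bytes per field) come from the description above.

```python
SPRITE_TABLE_BASE = 56   # byte address of the first sprite structure (from the text)
SPRITE_STRUCT_SIZE = 8   # bytes per sprite structure
SPRITE_COUNT = 16        # maximum number of sprites

# Byte offset of each 2-byte field inside one sprite structure.
FIELD_OFFSETS = {
    "definition_address": 0,
    "x": 2,
    "y": 4,
    "transparent_color": 6,
}


def sprite_field_address(sprite_index, field):
    """Return the byte address of one field of one sprite structure."""
    if not 0 <= sprite_index < SPRITE_COUNT:
        raise ValueError("sprite index must be in 0..15")
    return SPRITE_TABLE_BASE + sprite_index * SPRITE_STRUCT_SIZE + FIELD_OFFSETS[field]


if __name__ == "__main__":
    # Print the byte addresses of the four fields of the first sprite.
    for name in FIELD_OFFSETS:
        print(name, sprite_field_address(0, name))
```

For the first sprite this gives the addresses 56, 58, 60 and 62, which are the ones used in the assembler example.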

Sprite definition data consists of 16 lines of 16 pixels each, with each pixel defined by 4 bits: xrgb. This means that one sprite line takes 8 bytes (two pixels per byte), so a complete sprite definition needs 8 x 16 = 128 bytes.
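To make the packing concrete, here is a hedged Python sketch that turns 16 four-bit pixel values into the four 16-bit words of one sprite line. The function names are hypothetical. The nibble order (leftmost pixel in the highest nibble of the first word) is an assumption taken from how the Verilog display code further down indexes `sprite_pixels`.

```python
def pack_sprite_line(pixels):
    """Pack 16 four-bit pixel values into four 16-bit words.

    The leftmost pixel goes into the highest nibble of the first word.
    """
    if len(pixels) != 16 or any(not 0 <= p <= 0xF for p in pixels):
        raise ValueError("a sprite line needs 16 pixel values in 0..15")
    words = []
    for i in range(0, 16, 4):
        a, b, c, d = pixels[i:i + 4]
        words.append((a << 12) | (b << 8) | (c << 4) | d)
    return words


def pack_sprite(rows):
    """Pack 16 lines of 16 pixels into 64 words (128 bytes)."""
    if len(rows) != 16:
        raise ValueError("a sprite needs 16 lines")
    return [word for row in rows for word in pack_sprite_line(row)]


if __name__ == "__main__":
    # One line with four non-zero pixels in the middle.
    line = [0, 0, 0, 0, 0, 0, 4, 0xF, 0xF, 4, 0, 0, 0, 0, 0, 0]
    print([hex(w) for w in pack_sprite_line(line)])
```

A full sprite packs into 64 words, which matches the 128 bytes computed above.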

Here is an example of showing one sprite at (25, 25) in assembly language:

  mov r0, sprite_def
  mov r1, 56
  st [r1], r0  ; sprite definition is at sprite_def address
  mov r0, 25
  st [r1 + 2], r0  ; x = 25  at addr 58
  mov r0, 25
  st [r1 + 4], r0  ; y = 25  at addr 60
  mov r0, 0
  st [r1 + 6], r0  ; transparent color is black (0) at addr 62
  ; sprite definition
sprite_def:
  #d16 0x0000, 0x0000, 0x0000, 0x0000  ; 0
  #d16 0x0000, 0x000f, 0xf000, 0x0000  ; 1
  #d16 0x0000, 0x000f, 0xf000, 0x0000  ; 2
  #d16 0x0000, 0x000f, 0xf000, 0x0000  ; 3
  #d16 0x0000, 0x004f, 0xf400, 0x0000  ; 4
  #d16 0x0000, 0x004f, 0xf400, 0x0000  ; 5
  #d16 0x0000, 0x044f, 0xf440, 0x0000  ; 6
  #d16 0x0000, 0x444f, 0xf444, 0x0000  ; 7
  #d16 0x0004, 0x444f, 0xf444, 0x4000  ; 8
  #d16 0x0044, 0x444f, 0xf444, 0x4400  ; 9
  #d16 0x0400, 0x004f, 0xf400, 0x0040  ; 10
  #d16 0x0000, 0x004f, 0xf400, 0x0000  ; 11
  #d16 0x0000, 0x004f, 0xf400, 0x0000  ; 12
  #d16 0x0000, 0x041f, 0xf140, 0x0000  ; 13
  #d16 0x0000, 0x4111, 0x1114, 0x0000  ; 14
  #d16 0x0004, 0x4444, 0x4444, 0x4000  ; 15

How does this work? First of all, I had to decide how to implement sprites. I decided to fetch all sprite data during the vertical blanking interval (VBI). During VBI, the video subsystem starts by reading the 8-byte sprite structure at address 56 (decimal). The address and data buses are 16 bits wide, so the computer is word-oriented (it reads two bytes at a time), and the actual address put on the bus is 56 >> 1 == 28:

if ((x >= 640) && (y == 479) && (state == IN_LINE)) begin
  // when we start the vertical blanking, 
  // we need to fetch in advance the first sprite data
  state <= READ_SPRITES;
  sprite_counter <= 4'b0;
  rd <= 1'b1;
  wr <= 1'b0;
  mem_read <= 1'b1;
  addr <= 16'd28;    // prepare to read sprite definition address
end

In the next clock cycle, the system is in the READ_SPRITES state. The first thing we do there is store the sprite definition address, which is now present on the data bus because the memory read was initiated in the previous state.

Then we prepare the address bus for the next state, in which we will fetch the x coordinate of the sprite. We do that by setting the address bus to (58 + (sprite_counter << 3)) >> 1, with sprite_counter iterating from 0 to 15:

READ_SPRITES: begin
  sprite_addr[sprite_counter] <= data;
  state <= READ_SPRITE_X;
  rd <= 1'b1;
  wr <= 1'b0;
  mem_read <= 1'b1;
  // prepare to read x coordinate of the sprite
  addr <= (16'd58 + (sprite_counter << 3)) >> 1;    
end

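In the READ_SPRITE_X state, we store the x coordinate of the sprite, which is ready on the data bus, and then prepare to read the y coordinate in the next state. The listing for that state is not included here; it follows the same pattern as the READ_SPRITE_Y state: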

READ_SPRITE_Y: begin
  sprite_y[sprite_counter] <= data;
  state <= READ_SPRITE_TRANSPARENT_COLOR;
  rd <= 1'b1;
  wr <= 1'b0;
  mem_read <= 1'b1;
  // prepare to read transparent color of the sprite  
  addr <= (16'd62 + (sprite_counter << 3)) >> 1;    
end

In the READ_SPRITE_Y state, we store the y coordinate of the sprite, which is ready on the data bus, and then prepare to read the sprite's transparent color in the next state:

READ_SPRITE_TRANSPARENT_COLOR: begin
  sprite_transparent_color[sprite_counter] <= data[3:0];
  state <= READ_SPRITE_DATA;
  rd <= 1'b1;
  wr <= 1'b0;
  mem_read <= 1'b1;
  line_counter <= 16'b0;
  word_counter <= 4'b0;
  // read sprite definition bytes
  addr <= sprite_addr[sprite_counter] >> 1;    
end

In the READ_SPRITE_TRANSPARENT_COLOR state, we store the transparent color of the sprite, and then put the address of the sprite definition on the address bus so we can fetch it in the next state:

READ_SPRITE_DATA: begin
  if (line_counter < 16) begin
    case (word_counter) 
    0:  sprite_pixels[sprite_counter][line_counter][63:48] <= data;
    1:  sprite_pixels[sprite_counter][line_counter][47:32] <= data;
    2:  sprite_pixels[sprite_counter][line_counter][31:16] <= data;
    3:  sprite_pixels[sprite_counter][line_counter][15:0]  <= data;
    endcase
    state <= READ_SPRITE_DATA;
    rd <= 1'b1;
    wr <= 1'b0;
    mem_read <= 1'b1;
    if (word_counter < 3) begin
      word_counter = word_counter + 1'b1;
    end
    else begin
      word_counter = 1'b0;
      line_counter = line_counter + 16'b1;
    end
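    // read the next sprite definition word; the counters above use blocking
    // assignments, so addr is computed from their already-updated values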
    addr = (sprite_addr[sprite_counter] + ((word_counter +
           (line_counter << 2)) << 1) ) >> 1;    
  end
  else 
  begin
    if (sprite_counter < SPRITE_NUM) begin
      sprite_counter = sprite_counter + 1'b1;
      state <= READ_SPRITES;
      rd <= 1'b1;
      wr <= 1'b0;
      mem_read <= 1'b1;
      // read next sprite definition address
      addr <= (16'd56 + (sprite_counter << 3)) >> 1;   
    end
    else begin
      sprite_counter <= 4'b0;
      rd <= 1'b1;
      wr <= 1'b0;
      mem_read <= 1'b1;
      addr <= VIDEO_MEM_ADDR + 0;
      state <= V_BLANK;
    end
  end
end

In the READ_SPRITE_DATA state, we read the sprite definition from memory. We do it for each line of the sprite (16 lines per sprite) and, within a line, for each of the four words, each holding four pixels of the sprite line.

When we finish loading the definition data for the current sprite, we do the same for the next sprite, until all sprite definitions have been read. We then set the address bus to load the pixel data for the (0, 0) position on the screen, and move to the V_BLANK state:
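The whole fetch sequence for one sprite can be modeled in software. The following Python sketch is not the hardware implementation; it only mirrors the address arithmetic of the states described above, with memory modeled as a list of 16-bit words indexed by word address. The function name `load_sprite` and the returned dictionary keys are made up for this illustration.

```python
def load_sprite(memory, sprite_index):
    """Read one sprite structure and its 16 lines of pixel data.

    memory is a list of 16-bit words indexed by word address
    (byte address >> 1), as on the word-oriented bus.
    """
    base = (56 + (sprite_index << 3)) >> 1       # word address of the structure
    definition_address = memory[base]            # READ_SPRITES
    x = memory[base + 1]                         # READ_SPRITE_X
    y = memory[base + 2]                         # READ_SPRITE_Y
    transparent = memory[base + 3] & 0xF         # only the low 4 bits are kept

    lines = []
    first_word = definition_address >> 1         # byte address -> word address
    for line in range(16):                       # READ_SPRITE_DATA
        value = 0
        for word in range(4):
            # The first word of a line ends up in bits 63:48.
            value = (value << 16) | memory[first_word + line * 4 + word]
        lines.append(value)

    return {
        "definition_address": definition_address,
        "x": x,
        "y": y,
        "transparent": transparent,
        "lines": lines,
    }


if __name__ == "__main__":
    memory = [0] * 256
    memory[28:32] = [200, 25, 25, 0]             # structure of sprite 0
    memory[100:104] = [0x0000, 0x000F, 0xF000, 0x0000]
    sprite = load_sprite(memory, 0)
    print(sprite["x"], sprite["y"], hex(sprite["lines"][0]))
```

Each sprite line becomes one 64-bit value here, which corresponds to one `sprite_pixels[sprite_counter][line_counter]` entry in the Verilog code.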

V_BLANK: begin
  pixels <= data;
  state <= SCAN_IDLE;
  rd <= 1'bz;
  wr <= 1'bz;
  mem_read <= 1'b0;
end

In the V_BLANK state, we read the frame buffer pixels for the (0, 0) coordinate, then set all the control signals to high impedance and set the state to SCAN_IDLE. We leave the SCAN_IDLE state when the time comes to start displaying pixels from the (0, 0) coordinate.

Displaying sprite data

During scanline processing, we need to display both the original pixels from the frame buffer and the sprite data, and we need to make sure that the original pixels show through wherever a sprite pixel has the transparent color.

This is done in the following code:

if (valid) begin
  for (i = 0; i < SPRITE_NUM; i = i+1) begin
    if ((sprite_addr[i] != 16'b0) &&
       (xx >= sprite_x[i]) &&
       (xx < (sprite_x[i] + 16)) &&
       (yy >= sprite_y[i]) &&
       (yy < (sprite_y[i] + 16))) begin

      sprite_found = 1'b1;
      if (
        sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 0] != sprite_transparent_color[i][0] ||
        sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 1] != sprite_transparent_color[i][1] ||
        sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 2] != sprite_transparent_color[i][2]
      ) begin
        r <= sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 0] == 1'b1;
        g <= sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 1] == 1'b1;
        b <= sprite_pixels[i][yy - sprite_y[i]][60-(((xx - sprite_x[i]) << 2) ) + 2] == 1'b1;
      end 
      else begin
        r <= pixels[12 - ((xx & 3) << 2) + 0] == 1'b1;
        g <= pixels[12 - ((xx & 3) << 2) + 1] == 1'b1;
        b <= pixels[12 - ((xx & 3) << 2) + 2] == 1'b1;
      end
    end 
  end
  if (!sprite_found) begin
    r <= pixels[12 - ((xx & 3) << 2) + 0] == 1'b1;
    g <= pixels[12 - ((xx & 3) << 2) + 1] == 1'b1;
    b <= pixels[12 - ((xx & 3) << 2) + 2] == 1'b1;
  end
  else begin
    sprite_found = 1'b0;
  end
end
else begin
  // blanking -> no pixels
  r <= 1'b0;
  g <= 1'b0;
  b <= 1'b0;
end
end

The most interesting thing is the "for loop". It is not a loop at run time: it repeats the enclosed Verilog code SPRITE_NUM times. That is the most important thing to understand about "loops" in Verilog. There is no linear code that gets executed multiple times. Instead, everything is a giant state machine that pulses with the clock signal, the "for loop" just unrolls the code multiple times, and all of that unrolled code "works" at the same time.

So, when we have this Verilog code:
 for (i = 0; i < SPRITE_NUM; i = i+1) begin
    if ((sprite_addr[i] != 16'b0) &&
       (xx >= sprite_x[i]) &&
       (xx < (sprite_x[i] + 16)) &&
       (yy >= sprite_y[i]) &&
       (yy < (sprite_y[i] + 16))) begin

It actually does this:
    if ((sprite_addr[0] != 16'b0) &&
       (xx >= sprite_x[0]) &&
       (xx < (sprite_x[0] + 16)) &&
       (yy >= sprite_y[0]) &&
       (yy < (sprite_y[0] + 16))) begin

...
    end
    if ((sprite_addr[1] != 16'b0) &&
       (xx >= sprite_x[1]) && 
       (xx < (sprite_x[1] + 16)) && 
       (yy >= sprite_y[1]) && 
       (yy < (sprite_y[1] + 16))) begin
...
    end
...


The code with the "for loop" does the same thing for all sprites:
  1. If the sprite definition address is not zero and the current x and y coordinates of the scanline are within the sprite's bounds, we put the current sprite pixel color on the r, g and b output signals, or the original frame buffer pixel color if the current sprite pixel is transparent (its color equals the transparent color).
  2. Otherwise, if the current x and y coordinates of the scanline are outside of the sprite coordinates, we put the frame buffer pixel data on the r, g and b output signals.
  3. Otherwise, we must be in the blanking interval, so we put zeros on the r, g and b output signals.
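The three cases above can also be modeled in software. This Python sketch is an interpretation of the Verilog listing, not the hardware itself; the function names and the sprite dictionary layout are hypothetical. It assumes, following the listing, that bits 0, 1 and 2 of a pixel nibble drive r, g and b, and that when several sprites overlap, the one with the highest index decides the output, because the last assignment in the unrolled code wins.

```python
def sprite_pixel(line_bits, dx):
    """Return the 4-bit value of pixel dx (0 = leftmost) of a 64-bit sprite line."""
    return (line_bits >> (60 - 4 * dx)) & 0xF


def output_color(xx, yy, framebuffer_pixel, sprites, valid=True):
    """Return the (r, g, b) bits for screen position (xx, yy).

    framebuffer_pixel is the 4-bit frame buffer value at that position.
    Each sprite is a dict with the keys definition_address, x, y,
    transparent and lines (16 integers, one 64-bit value per line).
    """
    if not valid:
        return (0, 0, 0)                      # blanking interval: no pixels

    color = framebuffer_pixel & 0x7           # default: frame buffer shows
    for sprite in sprites:
        if sprite["definition_address"] == 0:
            continue                          # sprite is switched off
        dx = xx - sprite["x"]
        dy = yy - sprite["y"]
        if 0 <= dx < 16 and 0 <= dy < 16:
            value = sprite_pixel(sprite["lines"][dy], dx) & 0x7
            if value != (sprite["transparent"] & 0x7):
                color = value                 # opaque sprite pixel
            else:
                color = framebuffer_pixel & 0x7   # transparent: show frame buffer

    # Bit 0 drives r, bit 1 drives g, bit 2 drives b, as in the listing.
    return (color & 1, (color >> 1) & 1, (color >> 2) & 1)


if __name__ == "__main__":
    sprite = {
        "definition_address": 200,
        "x": 25,
        "y": 25,
        "transparent": 0,
        "lines": [0x0000000FF0000000] * 16,
    }
    print(output_color(32, 25, 0b010, [sprite]))
```

Note that only the three color bits are compared against the transparent color, as in the Verilog condition.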

Conclusion

This implementation of sprites requires that the vga module has its own internal memory which is filled with the sprite data from the main memory. Then, during the scanline processing, sprite pixels are combined with frame buffer pixels in a way that sprite pixels are placed "over" the frame buffer pixels, unless the current sprite pixel is the transparent one. If that is the case, then the frame buffer pixel is "shown" through the sprite.

The great thing about hardware sprites is that they do not consume processor time at all. Everything is done in hardware, and showing a sprite only requires setting its sprite definition address to a non-zero value.

Tuesday, November 6, 2018

Mozilla Thunderbird still working when you close the program

How to fix the problem when Mozilla Thunderbird remains in memory after closing

I have used Thunderbird for more than ten years. I simply cannot migrate to anything else. Unfortunately, Thunderbird has one annoying feature (or a bug): it stays in memory when you close the program using the X button in the top right corner. Why is that bad? Well, if you close Thunderbird that way and then start it again, it will start behaving oddly. It will either stop receiving mail or get stuck when sending an email. It will simply work badly.

To remedy this, it is recommended that you close Thunderbird by using the Exit option from the File menu. However, I have gotten so used to closing the application by clicking on the X button that it is annoying for me to change my habits and do some extra clicks (instead of a single one).

Even worse, some people claim that Thunderbird remains active even if you go to the File menu and choose the Exit option. I am almost sure that it happened with my Thunderbird too, a couple of times.

So, I decided to do something about that problem, and today I created a batch file with the following content:

taskkill /IM thunderbird.exe /F
start "Thunderbird" "c:\Program Files (x86)\Mozilla Thunderbird\thunderbird.exe"

Instead of clicking on the Thunderbird icon, I click on the shortcut on my desktop which points to this batch file. 

What does this batch file do? It kills any remaining Thunderbird instance before starting a new one. The "/F" switch forces the kill (just like "kill -9" on Linux), while "/IM thunderbird.exe" specifies the image name of the process to be killed.

Make desktop shortcut to a batch file work on Windows 10

This brings us to the second problem: on Windows 10, if you create a shortcut on the Desktop which points to a batch file, it doesn't work (if you double-click on that shortcut, nothing happens).

You need to edit the newly created shortcut and add the following text in the Target field, just before the original command:

C:\Windows\System32\cmd.exe /c "original command"

For example, my batch file is in the C:\Tools folder. This means that the Target field of the original shortcut looked like this:

C:\Tools\thunderbird.bat

Now, the modified Target looks like this:

C:\Windows\System32\cmd.exe /c "C:\Tools\thunderbird.bat"


Note that you need to put the original command in double quotes.