qSoC – The QMEM bus

QMEM bus specification

This post describes the QMEM bus, the different cycles allowed, the bus elements and different bus configurations supported.

1. Introduction
2. Features
3. Signals Description
4. Cycles Description
5. Bus Elements
6. Bus Configurations

Introduction

QMEM (short for quick memory) is a flexible, portable, simple and fast system interconnect bus, specifically targeted at SoC systems for their on-chip communication needs.

QMEM is based on a synchronous memory bus with added flow control signals, which makes it very simple and fast. QMEM originates from the OR1200 open-source CPU implementation, where it was used as a tightly-coupled memory (TCM) bus inside the CPU.

Features

  • flexible: endian-independent, supports different data and address widths, flexible access speeds, and flexible interconnect methods such as point-to-point, shared bus and multi-layered interconnect
  • portable: fully vendor-, tool-, language- and technology-independent
  • simple: based on a synchronous memory bus with added flow control, it needs only minimal bus interconnect logic
  • fast: fully pipelined, single-cycle reads and writes, with no setup or end cycles
  • extensible: any number of transfer tags can be added to support different features, like master identification, slave error reporting, etc.

Signals description

  • cs – master output signal denoting a valid master cycle when cs='1', and an idle cycle when cs='0'
  • we – master read/write select signal, denotes a write cycle when we='1', and a read cycle when we='0'
  • sel – master byte select signal, one bit for each byte in the data word, generally only taken into account during write cycles and ignored during read cycles; sel='1111' denotes four bytes, sel='0011' denotes the lower two bytes, sel='0100' denotes a single byte
  • adr – master address signal
  • dat_w – master write data
  • dat_r – slave read data
  • ack – slave cycle acknowledge, asserted and valid when master cs='1'
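
To make the byte-select behaviour concrete, here is a minimal Python sketch of how a slave might merge write data into a stored word. The function name apply_sel and the mapping of sel bit 0 to the least-significant byte are assumptions made for illustration; they are not taken from the QMEM sources.

```python
def apply_sel(old_word: int, dat_w: int, sel: int, n_bytes: int = 4) -> int:
    """Merge dat_w into old_word, one byte lane per set bit in sel.

    Assumption: bit 0 of sel selects the least-significant byte.
    """
    result = old_word
    for lane in range(n_bytes):
        if (sel >> lane) & 1:
            mask = 0xFF << (8 * lane)
            # replace only the selected byte lane
            result = (result & ~mask) | (dat_w & mask)
    return result & ((1 << (8 * n_bytes)) - 1)


# sel='0011' replaces only the lower two bytes
print(hex(apply_sel(0xAABBCCDD, 0x11223344, 0b0011)))  # 0xaabb3344
# sel='0100' replaces a single byte (byte lane 2)
print(hex(apply_sel(0xAABBCCDD, 0x11223344, 0b0100)))  # 0xaa22ccdd
```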

All signals are active-high. Optionally, clock (clk) and reset (rst) signals can be considered part of the QMEM bus, especially if the bus uses a different clock domain than the rest of the master or slave logic. Other common optional signals are the slave error response (err) and the master id signal (mid).

The master can start a cycle at any time (synchronously to the clock) by asserting cs='1' and setting the other signals as appropriate. The cycle ends with the slave acknowledge (ack='1'). After the slave acknowledges the cycle, the master is free to start a new cycle immediately by keeping cs='1', or to go to an idle state by deasserting cs (cs='0').

Cycles description

Reset condition

In reset state (rst='1'), all QMEM bus signals should be ignored, and their state can be undefined. The first cycle out of reset should be initialized as an IDLE cycle.

IDLE cycle

An IDLE cycle is denoted by master cs='0' and slave ack='0'. There is no activity on the bus, other than possibly the slave read data (dat_r), if the previous cycle was a read cycle.

QMEM reset state & idle cycle

WRITE cycles

A WRITE cycle is denoted when the master asserts cs='1' and we='1', and puts the desired address on adr, the byte-select on sel and the data to be written on dat_w. The master must not change any of its signals, or stop the cycle by deasserting cs (cs='0'), before it receives the slave acknowledge (ack='1'). The slave can insert any number of delay cycles by holding ack='0' while the master asserts cs='1', until it is ready to service the master’s request. The master can start a new cycle immediately after synchronously detecting the slave acknowledge response (ack='1').
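
The handshake can be sketched as a small clock-by-clock Python model. DelayedSlave and run_writes are hypothetical names used only for this illustration, and sel is left out for brevity; the point is simply that the master keeps presenting the same request until it sees ack='1'.

```python
class DelayedSlave:
    """Hypothetical slave that holds ack=0 for `delay` clocks per access."""

    def __init__(self, delay):
        self.delay = delay
        self.wait = 0
        self.mem = {}

    def clock(self, cs, we, adr, dat_w):
        """Evaluate one clock period and return ack for that period."""
        if not cs:
            self.wait = 0
            return 0
        if self.wait < self.delay:
            self.wait += 1          # delay cycle: ack stays low
            return 0
        self.wait = 0
        if we:
            self.mem[adr] = dat_w   # write is accepted together with ack
        return 1


def run_writes(slave, writes):
    """Drive back-to-back WRITE cycles; return the number of clocks used."""
    clocks = 0
    for adr, data in writes:
        while True:
            clocks += 1
            # the master holds cs, we, adr and dat_w unchanged until ack
            if slave.clock(cs=1, we=1, adr=adr, dat_w=data):
                break
    return clocks


writes = [(0x0, 0x11), (0x4, 0x22), (0x8, 0x33)]
print(run_writes(DelayedSlave(delay=0), writes))  # 3
print(run_writes(DelayedSlave(delay=1), writes))  # 6
```

With no delay each write takes one clock; with one delay cycle per access the same three writes take six clocks.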

QMEM single write cycle with no delay

QMEM single write cycle with 1 cycle delay

QMEM multiple write cycles with no delay

QMEM multiple write cycles with 1 cycle delay

READ cycles

A READ cycle is denoted when the master asserts cs='1' and we='0', and puts the desired address on adr and the byte-select on sel. The master must not change any of its signals, or stop the cycle by deasserting cs (cs='0'), before it receives the slave acknowledge (ack='1'). The slave can insert any number of delay cycles by holding ack='0' while the master asserts cs='1', until it is ready to service the master’s request. The master can start a new cycle immediately after synchronously detecting the slave acknowledge response (ack='1').
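
A rough Python sketch of pipelined reads with a zero-delay slave follows. It assumes that dat_r becomes valid in the clock period after the acknowledged request, which is how I read the IDLE cycle description above; the function name run_reads and the trace format are made up for this example.

```python
def run_reads(mem, addresses):
    """Zero-delay slave model; returns one (clock, ack, dat_r) tuple per clock.

    Assumption: read data is registered, so it shows up on dat_r one
    clock period after the acknowledged read request.
    """
    trace = []
    dat_r = None  # undefined until the first read has completed
    # one extra idle clock at the end lets the last read data appear
    for clock, adr in enumerate(list(addresses) + [None]):
        cs = adr is not None
        ack = 1 if cs else 0
        trace.append((clock, ack, dat_r))
        if cs:
            dat_r = mem[adr]  # visible in the next clock period
    return trace


mem = {0x0: 0xAA, 0x4: 0xBB}
for entry in run_reads(mem, [0x0, 0x4]):
    print(entry)
```

In this model the data for the first read appears while the second read request is already on the bus, and the data for the last read appears during the following IDLE cycle.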

QMEM single read cycle with no delay

QMEM single read cycle with 1 cycle delay

QMEM multiple read cycles with no delay

QMEM multiple read cycles with 1 cycle delay

MIXED cycles

A QMEM master is free to mix READ, WRITE and IDLE cycles any way it chooses. Since reads are pipelined, the slave must be ready to respond to a master WRITE cycle even if it arrives in the same clock period in which the slave is still completing the previous cycle’s read request.

QMEM mixed cycles with no delay

QMEM mixed cycles with 1 cycle delay

ERROR cycle

The QMEM bus has an optional err signal. The slave must keep this signal low (err='0') unless it wishes to communicate an error condition to the master. It does so by asserting err='1' at the same time it acknowledges the cycle with ack='1'. Usually, the error signal is high if the slave is in reset. A slave might also raise the error condition if the master tries to access a memory location or register that lies outside the slave’s address range.
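
As a small illustration, the two error conditions mentioned above could be modelled like this in Python. The function slave_response and its parameters are hypothetical, and the sketch only covers a slave without delay cycles.

```python
def slave_response(adr, size_bytes, in_reset=False):
    """Return (ack, err) for one clock period in which the master has cs=1.

    err is only raised together with ack, never on its own.
    """
    out_of_range = not (0 <= adr < size_bytes)
    if in_reset or out_of_range:
        return 1, 1   # acknowledge the cycle and flag the error
    return 1, 0       # normal acknowledge


print(slave_response(0x0010, size_bytes=0x1000))                 # (1, 0)
print(slave_response(0x2000, size_bytes=0x1000))                 # (1, 1)
print(slave_response(0x0010, size_bytes=0x1000, in_reset=True))  # (1, 1)
```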

QMEM error cycle

QMEM bus elements

The QMEM bus has four major components: masters, slaves, arbiters and decoders.

QMEM master

A QMEM master is a master device on the bus. It can start cycles, set the bus direction and the number of bytes affected, and either supply the data to be written or read data from slaves.

An example of a QMEM master (non-synthesizable) can be seen here: qmem_master.v

QMEM slave

A QMEM slave responds to master cycles, either writing the master data to its memory or registers, or reading its memory or registers and sending them to the master.

An example of a QMEM slave (non-synthesizable) can be seen here: qmem_slave.v

QMEM arbiter

An arbiter is a bus element that decides which master has access to a slave device in a given cycle. Each QMEM slave that is accessed by multiple masters must have an arbiter. An arbiter can grant masters access to the slave on a priority basis, with a round-robin scheme, or with a combination of the two.

An example of a priority-based QMEM arbiter (synthesizable) can be seen here: qmem_arbiter.v
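
The two arbitration schemes can be sketched in a few lines of Python. This is not a translation of qmem_arbiter.v; the function names and the choice of master 0 as the highest priority are assumptions for illustration only.

```python
def priority_grant(requests):
    """Fixed priority: grant the lowest-numbered master that has cs=1.

    Assumption: master 0 has the highest priority.
    """
    for i, cs in enumerate(requests):
        if cs:
            return i
    return None  # no master is requesting


def round_robin_grant(requests, last):
    """Round-robin: start searching just after the last granted master."""
    n = len(requests)
    for offset in range(1, n + 1):
        i = (last + offset) % n
        if requests[i]:
            return i
    return None


reqs = [0, 1, 1]                     # cs of masters m0, m1, m2
print(priority_grant(reqs))          # 1
print(round_robin_grant(reqs, 1))    # 2
print(round_robin_grant(reqs, 2))    # 1
```

A real arbiter would presumably also have to keep the grant stable until the slave acknowledges, since a master must not see its cycle interrupted before ack='1'.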

QMEM decoder

A decoder is a bus element that directs master requests to the appropriate slave, based on a slave decoding scheme, which is usually address-based. Each QMEM master that accesses multiple slaves must have a decoder. This only applies to immediate connections, so in a shared bus configuration where the masters connect to a single arbiter (= single slave), there is no need for a decoder on the master’s side. The slave side of the arbiter in this bus configuration could still have a slave decoder attached, if there are multiple slaves.

An example of a QMEM decoder (synthesizable) can be seen here: qmem_decoder.v
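
An address-based decoding scheme might look roughly like the following Python sketch. The address map, the slave names and the decode function are invented for this example and do not come from qmem_decoder.v.

```python
# Hypothetical address map: (slave name, base address, size in bytes)
SLAVE_MAP = [
    ("s0", 0x0000_0000, 0x0001_0000),
    ("s1", 0x0001_0000, 0x0000_1000),
]


def decode(adr):
    """Return (slave name, slave-local address), or None if nothing matches."""
    for name, base, size in SLAVE_MAP:
        if base <= adr < base + size:
            return name, adr - base
    return None


print(decode(0x0000_0040))  # ('s0', 64)
print(decode(0x0001_0008))  # ('s1', 8)
print(decode(0x0002_0000))  # None
```

What happens to an unmapped address is a design decision; one option would be to answer it with an error cycle.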

There are other elements that can be attached to a QMEM bus, like a bus register stage, which can register the master side, the slave side, or both sides of the bus; a bus monitor, which validates bus signals according to the rules; and bus converters, which can convert from and to the QMEM bus from other bus architectures, like APB, AHB and Wishbone. These still need to be written.

QMEM bus configurations

A QMEM bus can be built in multiple different configurations, depending on the speed or logic utilization needs. Some of the common configurations are: shared bus, point-to-point and multilayer.

In the following diagrams, mX represents masters, aX arbiters, dX decoders and sX slaves.

QMEM shared bus

QMEM shared bus

QMEM point-to-point

QMEM point-to-point bus

QMEM multilayer bus

QMEM multilayer bus

8 thoughts on “qSoC – The QMEM bus”

  1. zen0 says:

    Hey Chaos, very nice articles. I love people like you or Harbaum that share information and knowledge. I guess that one of the major reasons of the failure of the Replay board stays in his closed source. The community around MIST is impressive 🙂

    I started playing with a VIC-20 but I always liked to be more proficient with the hardware. I come to your site after browsing the MIST website. I really liked the work you guys have done it.

    Your idea to build a SoC only using an FPGA is awesome. I would like to follow your articles trying to replicate your experiments. Can I ask you what can be a good starting point in terms of FPGA Kit? Is the MIST suitable (it seems, according to your list of supported boards)? I also own a MiniSpartan6+ bought via Kickstarter (I would liked to play with HDMI protection and overlay).

    Thanks again for sharing information.

    • chaos says:

      Hi!

      Any FPGA board with reasonable logic element count, some memory and flash, possibly also an SD card socket and some free IO ports will be just fine. Your MiniSpartan6+ looks good to me. My favorite board at the moment is the DE1-SoC – cheap, powerful, with lots of stuff 🙂 But the MiST is also a nice board, and I will try to port all of my projects to it.

      I try to write most of the RTL code so that it is vendor-neutral, but there will always be some little pieces that need to be converted for each FPGA vendor, like PLLs, memory blocks, and of course the top-level ports. If you will need any help with that, just ask, and I will try to help.

      One of the first projects will be an audio player, reading WAV files from SD card and playing them. This is mostly so I can fix the sigma-delta audio DAC implementation used in minimig 😉

  2. zen0 says:

    Hi, thanks for support. Maybe the DE1 would be better than the MiST to learn HDL (do you need external programmer to use MiST? I mean JTAG for ARM and FPGA.)
    I noticed on Terasic website that there are 2 SoC board: the DE1-SoC and the SoCKit.
    What is the difference between the two and which do you suggest to use?

    • chaos says:

      Hi,

      The DE1 (and the DE1-SoC) is a more general purpose board, but you could use the MiST board just as well, if you already have it. You will need an external USB Blaster (JTAG interface) if you plan to use it for debugging, but you don’t need it just for uploading cores. A separate JTAG interface would be needed for ARM debugging on the Mist board. Both of the DE1 boards have onboard JTAG-USB interfaces.

      The DE1 is an older board with a Cyclone II FPGA on it, and the DE1-SoC is much newer, with a Cyclone V SoC on it, which also contains a dual-core ARM CPU, which IMO makes it much more flexible.

      If you don’t already have any of them, I’d suggest to get the newer DE1-SoC board.

      • zen0 says:

        Hi chaos, no I don’t have any boards. I found your site when I was looking for information about the MiST. I really like the MiST (mostly for the huge number of cores available 😀 ) but I think that it will become my second board ^^ maybe in few months.
        About the previous post, I’m referring to these 2 boards:
        SoCKit – the Development Kit for New SoC Device (http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=165&No=816)
        and the DE1-SoC that you mentioned.
        However, I’m inclined to order the DE1-SoC, also because Digi-Key does not sell the SoCKit.

        • chaos says:

          Hi,

          the SocKIT and the DE1-SoC are pretty similar, both have the Cyclone5-SoC FPGA variant on them, maybe the SocKIT has a slightly bigger FPGA.

          The SocKIT has 2xDDR memories, while the DE1-SoC has more user-friendly SDRAM on FPGA side and DDR on HPS side. Overall, I’d say the DE1-SoC is more user-friendly with its standard DEx GPIO ports, PS2 connector, ADC inputs, etc.

          Both are great boards, but IMO the DE1-SoC seems more appropriate for my needs.

          • zen0 says:

            Hey chaos, thanks for the reply. In the meanwhile I got my DE1-SoC ^^
            Now I’m reading the documentation and installing the software. With the Minispartan6+ I started learning VHDL (I’m really a newbie) while the Altera examples/docs seem more Verilog oriented. What do you think about?
            The code written for the DE1 with Cyclone IV can be adapted for the Cyclone V ?

          • chaos says:

            Hi,

            sorry for the late reply – I didn’t get any notification.

            VHDL and Verilog are both OK, I guess it comes down to personal preferences. I’m more of a Verilog guy myself 🙂

            And yes, you can reuse code for different FPGAs, even between vendors (Xilinx/Altera), as long as you don’t use vendor-specific blocks. There are certain things that *will* have to be changed, like PLL instances etc.
