How to create a Unity Rhythm Game Part 2: Generating the Steps

Part 1

In the last article we went over how to create a parser that dynamically converts a valid .sm file into a data structure we can use in-game (with the exception of some edge cases, such as the song having multiple BPMs). So we now have a fully populated data structure that looks something like this:

SM Data Structure Example

If that didn’t make much sense, then don’t worry too much about it! The previous article goes over the structure in a little more detail. Now let’s tackle actually using this data. The plan is to dynamically spawn the arrows at the correct time so that when they fall down the screen (at a speed determined by the user’s difficulty) they hit the detection zone in sync with the song.

This may sound like a difficult task, but we can break it down rather easily with a simple physics equation:

Speed = Distance / Time

We want to find the time in the song at which we should spawn the arrow. We are given the speed through the difficulty the user has selected (the arrows/steps move faster at higher difficulties), and we pick a location off-screen as our fixed ‘spawn distance.’ With this information in mind we can re-arrange the equation to:

Time = Distance / Speed

 Now remember this equation because it becomes fundamental to the calculation later.
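As a quick worked example with made-up numbers (the units are an assumption for illustration; the real values depend on how the arrows are moved in your own project): if the spawn point sits 10 units above the detection zone, the arrows travel at 5 units per second, and a note should be hit 30 seconds into the song, then:

```latex
t_{\text{offset}} = \frac{d}{s} = \frac{10\ \text{units}}{5\ \text{units/s}} = 2\ \text{s},
\qquad
t_{\text{spawn}} = t_{\text{hit}} - t_{\text{offset}} = 30\ \text{s} - 2\ \text{s} = 28\ \text{s}
```

So the arrow needs to appear 2 seconds before the moment it should be hit.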

The first thing is to set up the actual spawning process. To do this we need to initialize some of the variables we’ll need.

    public void InitSteps(Song_Parser.Metadata newSongData, 
                          Song_Parser.difficulties newDifficulty)
    {
        songData = newSongData;
        isInit = true;

        //We estimate how many seconds a single 'bar' will be in the song
        //Using the bpm provided in the song data
        barTime = (60.0f / songData.bpm) * 4.0f;
        
        difficulty = newDifficulty;
        distance = originalDistance;

        //We then use the provided difficulty to determine how fast the arrows 
        //will be going
        switch (difficulty)
        {
            case Song_Parser.difficulties.beginner:
                arrowSpeed = 0.007f;
                noteData = songData.beginner;
                break;
            case Song_Parser.difficulties.easy:
                arrowSpeed = 0.009f;
                noteData = songData.easy;
                break;
            case Song_Parser.difficulties.medium:
                arrowSpeed = 0.011f;
                noteData = songData.medium;
                break;
            case Song_Parser.difficulties.hard:
                arrowSpeed = 0.013f;
                noteData = songData.hard;
                break;
            case Song_Parser.difficulties.challenge:
                arrowSpeed = 0.016f;
                noteData = songData.challenge;
                break;
            default:
                goto case Song_Parser.difficulties.easy;
        }

        //This variable is needed when we look at changing the speed of the song
        //with the variable BPM mechanic
        originalArrowSpeed = arrowSpeed;
    }

A lot of what’s being done here is fairly straightforward: we are initializing the variables we need and setting the speed of the arrows to match the chosen difficulty. The one exception is this line:

barTime = (60.0f / songData.bpm) * 4.0f;

 This line may seem a little weird, but what we’re doing is estimating how much time a bar will take in seconds given the song’s bpm. Bpm is recorded as ‘beats per minute’ meaning that if we had 120 bpm then we could represent that as

120 Beats = 60 seconds

So let’s take that logic and make it a little more generic

BPM Beats = 60 Seconds

From this we can calculate the time taken for a single ‘beat’ by dividing 60 by the BPM

Time for a single beat = 60 / BPM

And we know that in most cases a bar will be 4 beats long. A bar can contain more than 4 notes, but in those cases the notes are closer together, meaning that the total time taken for the bar is the same despite the additional notes in it. If a bar is 4 beats we can modify the above equation to

Time for a Bar = 4 * Time for a single beat

Time for a Bar = 4 * (60 / BPM)
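To sanity-check the arithmetic, here is a minimal sketch of the same calculation as plain C# with no Unity dependency. It matches the barTime line in InitSteps, `(60.0f / songData.bpm) * 4.0f`. The class and method names (`RhythmTiming`, `BeatTime`, `BarTime`) are made up for this example and are not part of the project.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class RhythmTiming
{
    // Seconds taken by one beat: 60 seconds divided by beats per minute.
    public static float BeatTime(float bpm)
    {
        if (bpm <= 0f)
        {
            throw new ArgumentOutOfRangeException(nameof(bpm), "BPM must be positive.");
        }
        return 60.0f / bpm;
    }

    // Seconds taken by one bar, assuming 4 beats per bar as in the article.
    public static float BarTime(float bpm, int beatsPerBar = 4)
    {
        return BeatTime(bpm) * beatsPerBar;
    }
}
```

At 120 BPM a beat lasts 0.5 seconds, so `RhythmTiming.BarTime(120f)` gives 2 seconds per bar.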

In case it wasn’t obvious yet, there’s gonna be a bit of math involved when calculating these values. I’ll go through it as best I can when it crops up, however.

So now that initialization is done, let’s look at the meat of the generation process.

    // Update is called once per frame
    void Update () 
    {
        //If we're done initializing the rest of the world
        //And we havent gone through all the bars of the song yet
        if (isInit && barCount < noteData.bars.Count) 
        { 
            //We calculate the time offset using the s=d/t equation (t=d/s) 
            distance = originalDistance; 
            float timeOffset = distance / arrowSpeed;

            //Get the current time through the song 
            songTimer = heartAudio.time; 

            //If the current song time - the time Offset is greater than 
            //the time taken for all executed bars so far 
            //then it's time for us to spawn the next bar's notes 
            if (songTimer - timeOffset >= (barExecutedTime - barTime))
            {
                StartCoroutine(PlaceBar(noteData.bars[barCount++]));

                barExecutedTime += barTime;
            }
        }
    }

The above code is called through our Update method, which is a function called once per frame by the Unity engine. We need to keep this in mind as it can affect how we structure our game logic.

We first check whether initialization has finished. If it has, we check how many bars of notes we have ‘spawned’ so far; as long as we haven’t reached the end of the song, we go into the main body of the method.

We then calculate the intended time offset to spawn the arrows using the current arrow speed and the intended spawn distance, using the equation we showed before: t=d/s.

The last piece of information we need to keep track of is the current time progress of the song (i.e. how far through the song we are in seconds). We get this through the heartAudio variable, which contains the Unity AudioSource object we’re playing music from.

Now that we have all the info we need, we can decide whether to spawn the next bar of notes on this frame or not. In order to do that we check whether the current time of the song, minus the offset required to spawn the arrows on-time is larger than the time taken to go through all the executed bars so far, minus one ‘bar time.’ The logic behind this is a little funny, and it took some trial and error to get it working, so don’t be disheartened if it doesn’t make a lot of sense to you.
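If it helps to play with the condition outside of Unity, here is a rough sketch that mirrors the same check in plain C#. The `BarScheduler` class is hypothetical, and it assumes `barExecutedTime` starts at 0, which the article's code does not show.

```csharp
using System;

// Hypothetical stand-in for the spawning check in Update; not the project's real class.
public class BarScheduler
{
    private readonly float barTime;     // seconds per bar
    private readonly float timeOffset;  // distance / arrowSpeed
    private readonly int totalBars;     // number of bars in the song
    private float barExecutedTime;      // ASSUMPTION: starts at 0

    public int BarCount { get; private set; }

    public BarScheduler(float barTime, float timeOffset, int totalBars)
    {
        this.barTime = barTime;
        this.timeOffset = timeOffset;
        this.totalBars = totalBars;
    }

    // Call once per frame with the current song time.
    // Returns true when the next bar should be spawned on this frame.
    public bool Tick(float songTimer)
    {
        if (BarCount >= totalBars)
        {
            return false; // every bar has already been spawned
        }

        // Same comparison as the article's Update method.
        if (songTimer - timeOffset >= barExecutedTime - barTime)
        {
            BarCount++;
            barExecutedTime += barTime;
            return true;
        }
        return false;
    }
}
```

With a 2 second bar and a 1 second offset, `Tick` returns true at song times 0, 1 and 3, and false in between, so you can see how the offset and bar time interact before wiring it into a scene.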

If this is true, we call a Unity-specific function called ‘StartCoroutine.’ StartCoroutine starts the function passed into it as a coroutine: a function that can pause itself partway through and be resumed by Unity on a later frame. Coroutines still run on the main thread rather than a separate one, but because the function hands control back every time it pauses, it effectively runs alongside the rest of the game. If we didn’t have this then we’d need to wait until all the arrows in the bar were spawned before the update loop progressed. In some games this isn’t an issue, but in a rhythm game, where timing is key and the user is paying constant attention, we want anything lengthy to be spread across frames instead of blocking the main game loop.

We pass in a call to a function called PlaceBar, giving it the bar at index barCount, and then increment barCount (adding 1 to its value). barCount lets us know which bar in our data structure we need to place.

    IEnumerator PlaceBar(List<Song_Parser.Notes> bar)
    {
        for (int i = 0; i < bar.Count; i++)
        {
            if (bar[i].left)
            {
                GameObject obj = (GameObject)Instantiate(leftArrow, new Vector3(leftArrowBack.transform.position.x, leftArrowBack.transform.position.y + distance, leftArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = leftArrowBack;
            }
            if (bar[i].down)
            {
                GameObject obj = (GameObject)Instantiate(downArrow, new Vector3(downArrowBack.transform.position.x, downArrowBack.transform.position.y + distance, downArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = downArrowBack;
            }
            if (bar[i].up)
            {
                GameObject obj = (GameObject)Instantiate(upArrow, new Vector3(upArrowBack.transform.position.x, upArrowBack.transform.position.y + distance, upArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = upArrowBack;
            }
            if (bar[i].right)
            {
                GameObject obj = (GameObject)Instantiate(rightArrow, new Vector3(rightArrowBack.transform.position.x, rightArrowBack.transform.position.y + distance, rightArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = rightArrowBack;
            }
            yield return new WaitForSeconds((barTime / bar.Count) - Time.deltaTime);
        }
    }

This PlaceBar method runs as a coroutine, and because we start it through StartCoroutine it has to have an ‘IEnumerator’ return type. Because of this we can use Unity’s WaitForSeconds class to pause the coroutine for a certain period of time.

This method goes through all of the notes in that bar, and depending on which ‘step arrows’ are meant to be spawned, it creates an instance of that game object and initializes some stuff in their scripts.

Once the correct notes have been spawned, the method waits for some time with the line:

yield return new WaitForSeconds((barTime / bar.Count) - Time.deltaTime);

In a usual method, a return would end that method’s execution and return a value to whatever called it. With a method that returns IEnumerator, however, we can use ‘yield return’ to return multiple things over time. What yield return actually does is hand a value back to the caller and pause the method at that point; when the method is resumed, it carries on from the line after the yield, and this repeats until it has executed all the code in it.

In this instance, we exploit this by returning a reference to the previously mentioned WaitForSeconds class. We get the method to wait for however long it takes to reach the time where the next note should appear. Otherwise, all the notes in the bar will be spawned at once, instead of separately when they should.
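The pausing behaviour of yield return can be seen outside of Unity with a plain C# iterator. The sketch below is only an illustration: `YieldDemo`, `PlaceBarSketch` and `Drive` are made-up names, and the hand-written `Drive` loop stands in very loosely for the engine, which resumes a coroutine for you once the yielded wait is over.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo, for illustration only.
public static class YieldDemo
{
    public static readonly List<string> Log = new List<string>();

    // Pretends to spawn each note, then yields the delay (in seconds)
    // it would like to wait before the next note.
    public static IEnumerator PlaceBarSketch(int noteCount, float barTime)
    {
        for (int i = 0; i < noteCount; i++)
        {
            Log.Add("spawn note " + i);
            yield return barTime / noteCount; // the method pauses here
        }
    }

    // Resumes the routine step by step and returns how many times it paused.
    public static int Drive(IEnumerator routine)
    {
        int pauses = 0;
        while (routine.MoveNext()) // runs the method up to its next yield
        {
            Log.Add("paused, asked to wait " + routine.Current);
            pauses++;
        }
        return pauses;
    }
}
```

Calling `YieldDemo.Drive(YieldDemo.PlaceBarSketch(4, 2f))` leaves the log alternating between "spawn note" and "paused" entries, which shows that the loop body does not run all at once.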

We use ‘barTime,’ which contains the time taken for a single bar from our initial estimate, and divide it by the number of notes in this bar (bar.Count) to find the time taken for each note in this bar. Finally, we subtract the current deltaTime. Delta time is the difference in time between this frame and the last; subtracting it helps stop delays caused by the instantiation process from pushing notes later than they should be spawned.
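As a worked example with illustrative numbers (a 120 BPM song, so a 2 second bar, a bar containing 8 notes, and a game assumed to be running at 60 frames per second), the wait between notes comes out as:

```latex
t_{\text{wait}} = \frac{\text{barTime}}{\text{bar.Count}} - \Delta t
= \frac{2\ \text{s}}{8} - \frac{1}{60}\ \text{s}
\approx 0.25\ \text{s} - 0.0167\ \text{s}
\approx 0.233\ \text{s}
```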

With the steps generated, we add to the total barExecutedTime seen previously, and get ready to spawn the next bar!

If you were after a starting point on making a rhythm game then this will probably be enough for you. The next part of this writeup will be a quick one discussing how I allowed the user to generate a new BPM for a song by hitting the space bar to the rhythm wanted.

Room-Based Camera Systems & Implementation

Disclaimer: It has been brought to my attention by “doomedbunnies” on reddit that ‘free camera’ may not be the correct term for what I describe; s/he noted that free cameras often refer to cameras with ‘no clip’ that are just directly controlled by a user. They proposed that the free-following cameras I discuss be referred to as a ‘2D Follow Camera’ or ‘2D Platformer Camera.’ I won’t be editing every reference to free cameras in this article, but it is something to keep in mind should the term confuse you or should you try to reference this article.

So in my previous post I briefly went over my new design for Project GhostLight, and since then I’ve been working on bits and pieces in the evenings and the weekends, and gotten the game to the point where nearly all the required mechanics are in place. One of the things I mentioned in my previous post was a room-based camera system.

And so in this article I want to go through that a little bit, explain what effect I was hoping to achieve, and show my thought process and the final result. I imagine this article will have a fair amount of code, though again I will be going through the code and hopefully explaining it to the point where even those who are unfamiliar with C#, Unity, or even programming in general will be able to get the gist of it.

Okay, let’s start! As I may have mentioned, the camera system I wanted to achieve was heavily inspired by the cameras in Mega Man and Shovel Knight. I have already discussed briefly why I wanted a system like this, but just to re-iterate: a free camera system made it difficult for me to plan what the player could see at any given time, and so when it came to planning and designing levels I kept finding myself thinking “If only this was just a contained room with a fixed camera.”

But what is a room-based system? What is a free camera system? What’s the difference? Let me give a little bit of context for those who haven’t played the games already mentioned in this article. A free camera system is the more common type of camera implementation, where the camera follows the player and moves freely on its own. This can be seen in loads of games, from Super Mario Bros. to Metal Gear Solid V.

A free camera system has a single camera that moves to keep the player in frame

A room-based camera, on the other hand, is a style of camera usually reserved for 2D games. The concept is that the camera is locked to the constraints of a single ‘room’ or area, and when the player moves into another ‘room’ the camera performs a transition animation, and the game continues from there.

Room-based Camera: A room-based camera system snaps the camera to the room the player is in

And so based off of this desire, I thought of a few requirements for my system:

  1. Multiple rooms that the camera automatically snaps between when the player crosses that room’s boundaries, either vertically or horizontally.
  2. A small transition animation to give the player a chance to see the contents of the room.
  3. The potential for long panning rooms, either vertically or horizontally.

These simple requirements would be the basis of my camera implementation, which I’ll walk through below in pseudo-code and real code. There are lots of ways to do this, but this was my choice, and I’ll try to justify it as I go.

My first task was to create a room boundary that the player could cross to trigger the transition, and that tells the camera where to stick to. I did this by creating an empty object in Unity and giving it a 2D Box Collider; this would act as a trigger and would be the size of the room. When the player intersects with a trigger, Unity automatically calls some methods, including “OnTriggerEnter2D,” and from there we can trigger the transitions.

Let’s start with the OnTriggerEnter2D method; the script is attached to the player:

void OnTriggerEnter2D(Collider2D collider)
{
    if (collider.tag == "Room" && newRoom.go != collider.gameObject && !isTransitioning)    //If we're triggering a room, the room isn't the one we're in or transitioning to, and we aren't currently transitioning
    {
        //Hold the player's velocity before the transition starts
        tempVelocity = player.GetComponent<Rigidbody2D>().velocity; //Assumes the player uses a Rigidbody2D

        //Disable the player from moving
        DisablePlayer();

        //Get the new room
        newRoom.go = collider.gameObject;   //Grab the object instance for the room
        newRoom.bl = new Vector2(Mathf.Round((collider.transform.position.x - (collider.bounds.size.x / 2)) * 2) / 2,
                                    Mathf.Round((collider.transform.position.y - (collider.bounds.size.y / 2)) * 2) / 2);  //Get the bottom left of the room, using the center of the room as the origin
        newRoom.tr = new Vector2(Mathf.Round((collider.transform.position.x + (collider.bounds.size.x / 2)) * 2) / 2,
                                    Mathf.Round((collider.transform.position.y + (collider.bounds.size.y / 2)) * 2) / 2);  //Get the top right of the room, using the center of the room as the origin

        //Create a local copy of the player's position at the start of the transition
        Vector3 playerPos = player.transform.position;

        //Find whether the transition will be horizontal or vertical
        if (newRoom.bl.x >= currentRoom.tr.x) //If the new room's bottom left is further right than our current top right
        {
            currentTransitionDir = transitionDirection.right; //Then we must be transitioning right
        }
        else if (newRoom.tr.x <= currentRoom.bl.x) //Otherwise if the new room's top right is further left than our current bottom left
        {
            currentTransitionDir = transitionDirection.left; //Transitioning Left
        }
        else if (newRoom.bl.y >= currentRoom.tr.y) //If the new room's bottom left is further up (higher y value) than the current room's top right
        {
            currentTransitionDir = transitionDirection.up; //Transitioning Up
        }
        else if (newRoom.tr.y <= currentRoom.bl.y) //If the new room's top right is lower than the current room's bottom left
        {
            currentTransitionDir = transitionDirection.down; //Transitioning Down
        }
        else
        {
            currentTransitionDir = transitionDirection.none; //Placeholder for if something goes wrong
        }

        targetPos = ClampCamera(newRoom, cam.transform.position); //Clamp the camera's position to the bounds of the new room
        isTransitioning = true; //Set it so we are transitioning
    }
}
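The corner snapping and the direction test can be tried out without Unity. The following is a minimal sketch in plain C#; `RoomBounds`, `RoomMath` and `TransitionDirection` are hypothetical stand-ins for the project's own room struct and enum, which the article doesn't show in full.

```csharp
using System;

// Hypothetical types, for illustration only.
public enum TransitionDirection { None, Right, Left, Up, Down }

public struct RoomBounds
{
    public float BlX, BlY; // bottom-left corner
    public float TrX, TrY; // top-right corner
}

public static class RoomMath
{
    // Snap a coordinate to the nearest half unit: round(v * 2) / 2.
    public static float SnapToHalf(float value)
    {
        return (float)Math.Round(value * 2f) / 2f;
    }

    // Build the room corners from the collider's centre and size.
    public static RoomBounds FromCentre(float cx, float cy, float width, float height)
    {
        return new RoomBounds
        {
            BlX = SnapToHalf(cx - width / 2f),
            BlY = SnapToHalf(cy - height / 2f),
            TrX = SnapToHalf(cx + width / 2f),
            TrY = SnapToHalf(cy + height / 2f)
        };
    }

    // Same comparison order as the if/else chain in OnTriggerEnter2D.
    public static TransitionDirection DirectionTo(RoomBounds current, RoomBounds next)
    {
        if (next.BlX >= current.TrX) return TransitionDirection.Right;
        if (next.TrX <= current.BlX) return TransitionDirection.Left;
        if (next.BlY >= current.TrY) return TransitionDirection.Up;
        if (next.TrY <= current.BlY) return TransitionDirection.Down;
        return TransitionDirection.None;
    }
}
```

For example, a 16 by 9 room centred at (0, 0) and a second one centred at (16, 0) share an edge at x = 8, so `DirectionTo` reports a transition to the right.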

And so with the initial trigger completed, isTransitioning is now set to true, and the target position for the camera is now set. With this in mind we can begin to move the camera and player, causing the transition animation that was in our requirements. To do this we’ll use the Mathf.Lerp command, which returns a value between two numbers; how far along it lands between them is determined by a third parameter. In this case we’ll base that on the delta time to provide a smooth transition. As we need this to run each frame (so the change in delta time can smoothly move the object) we’ll incorporate it into the LateUpdate method, which Unity calls once per frame after every Update method has run.

// LateUpdate is called once per frame, after all Update methods have run
void LateUpdate()
{
    camBounds = new Vector2(cam.nativeResolutionWidth / pixelPerMeter, cam.nativeResolutionHeight / pixelPerMeter); //Save the current camera bounds so even if the screen is re-sized the transition remains consistent

    if (isTransitioning) //If we're transitioning
    {
        //Call the transition camera method
        TransitionCamera();
        //Call the transition player method
        TransitionPlayer();
        //If it's a downward transition, we'll also need to consider other objects in the scene that may move with the player, such as falling platforms
        if (currentTransitionDir == transitionDirection.down)
        {
            //Call the transition falling platforms method
            TransitionFallingPlatforms();
        }
    }
    else //If we're not transitioning
    {
        //We make sure the falling platforms are re-activated
        ReActivateFallingPlatforms();
        //And we pan the camera within the bounds of the current room
        PanCamera();
    }
}

We can use this to our advantage by having it so that when we are not transitioning, the camera can move freely within the bounds of the current room. This satisfies the requirement of long panning rooms as well! We clamp the camera through the use of the Mathf.Clamp command, which restricts a value and makes sure it’s between a lower and upper bound; we have a ClampCamera method that does a lot of it for us.

void PanCamera()
{
    Vector3 newCamPos = new Vector3(player.transform.position.x, cam.transform.position.y, cam.transform.position.z); //Set the cameras center to the player
    cam.transform.position = ClampCamera(newRoom, newCamPos); //Re-clamp it to the bounds of the room, so that it follows the player, but doesnt leave the current room
}
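The ClampCamera method itself isn't shown in this article, so the following is only a guess at the idea behind it, written as plain C# for a single axis. `CameraClamp` and `ClampAxis` are hypothetical names: the camera centre is kept at least half a view-width away from each room edge, so the view never shows anything outside the room.

```csharp
using System;

// Hypothetical sketch; the article's real ClampCamera is not shown.
public static class CameraClamp
{
    // Restrict value to the range [min, max].
    public static float Clamp(float value, float min, float max)
    {
        return Math.Max(min, Math.Min(max, value));
    }

    // Clamp the camera centre on one axis so the view stays inside the room.
    public static float ClampAxis(float desiredCentre, float roomMin, float roomMax, float viewSize)
    {
        float half = viewSize / 2f;
        float lowest = roomMin + half;
        float highest = roomMax - half;

        // ASSUMPTION: if the room is smaller than the view, centre the camera on the room.
        if (lowest > highest)
        {
            return (roomMin + roomMax) / 2f;
        }
        return Clamp(desiredCentre, lowest, highest);
    }
}
```

In a room spanning x = 0 to x = 32 with a 16 unit wide view, a player standing at x = 0 gives a camera centre of 8, while a player at x = 16 leaves the camera centred on them.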

So with the panning done, let’s get down to the meat of the matter: the transition animation. This is split into two parts, the camera and the player. Technically there are also the falling platforms (if they are close to the player) but for simplicity’s sake we won’t include that here.

Let’s start with the camera. We simply move its position using the previously mentioned Mathf.Lerp command between its current position and the target position. We also have to include a ‘minimum lerp’ distance, otherwise the transition will go on for a lot longer than needed, moving a minuscule amount each frame.
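To see why the minimum distance matters, here is a small sketch in plain C#. `LerpDemo` is a made-up class, its `Lerp` is written to behave like Mathf.Lerp (t is clamped to the 0 to 1 range), and the frame rate and speed in the usage note are illustrative values rather than the project's. Because each frame only covers a fraction of the remaining distance, the camera gets closer and closer without ever landing exactly on the target.

```csharp
using System;

// Hypothetical demo, for illustration only.
public static class LerpDemo
{
    // Linear interpolation with t clamped to [0, 1].
    public static float Lerp(float a, float b, float t)
    {
        t = Math.Max(0f, Math.Min(1f, t));
        return a + (b - a) * t;
    }

    // Counts the frames needed to get within minLerp of the target when
    // lerping from the current position every frame.
    public static int FramesToSettle(float start, float target, float deltaTime,
                                     float speed, float minLerp, int maxFrames = 100000)
    {
        float position = start;
        int frames = 0;
        while (Math.Abs(target - position) > minLerp && frames < maxFrames)
        {
            position = Lerp(position, target, deltaTime * speed);
            frames++;
        }
        return frames;
    }
}
```

Moving 10 units at 60 frames per second with a speed of 5 takes dozens of frames to get within 0.01 of the target, and a much tighter threshold would take many more, which is why the code snaps to the target once the gap drops below minLerp.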

void TransitionCamera()
{
    //Create a local copy of the target position (as the new camera position)
    Vector3 newCamPos = targetPos; 

    //Lerp camera between two positions
    if (Mathf.Abs(targetPos.x - cam.transform.position.x) > minLerp)
    {
        //If the x has changed, then move along the x
        newCamPos.x = Mathf.Lerp(cam.transform.position.x, targetPos.x, Time.deltaTime * transitionSpeed);
    }
    else if (Mathf.Abs(targetPos.y - cam.transform.position.y) > minLerp)
    {
        //If the y has changed, then move along the y
        newCamPos.y = Mathf.Lerp(cam.transform.position.y, targetPos.y, Time.deltaTime * transitionSpeed);
    }
    else
    {
        //Both x and y are equal so the camera has stopped moving
        newCamPos = targetPos;
        isTransitioning = false;

        //Enable Player again once the camera is in position
        EnablePlayer();

        //Make the current room equal the new room
        currentRoom.bl = newRoom.bl;
        currentRoom.tr = newRoom.tr;
        currentRoom.go = newRoom.go;
    }
    //Set the camera's z axis position to what it was before
    newCamPos.z = cam.transform.position.z;

    //Finally move the camera
    cam.transform.position = newCamPos;
}

The player’s transition follows similar logic. However, since the player retains its position at the transition entry point, it needs a conditional statement for the direction of the transition, so it can move the player in the right direction for them to end up in the correct room.

void TransitionPlayer()
{
    Vector3 playerPos = player.transform.position;

    //Lerp player between the four directions
    if (currentTransitionDir == transitionDirection.right)
    {
        //If the player hasn't yet reached the target position (including a horizontal offset) keep moving, otherwise we can finish the transition
        if (Mathf.Abs(targetPos.x + horizontalPlayerBuffer - playerPos.x - (camBounds.x / 2)) > minLerp)
        {
            //Right transition
            playerPos.x = Mathf.Lerp(playerPos.x, targetPos.x + horizontalPlayerBuffer - (camBounds.x / 2), Time.deltaTime * transitionSpeed);
        }
        else
        {
            //Right Lerp finished
            playerPos.x = targetPos.x + horizontalPlayerBuffer - (camBounds.x / 2);
        }
    }
    else if (currentTransitionDir == transitionDirection.left)
    {
        if (Mathf.Abs(((targetPos.x + (camBounds.x / 2)) - horizontalPlayerBuffer) - player.transform.position.x) > minLerp)
        {
            //Left transition
            playerPos.x = Mathf.Lerp(playerPos.x, (targetPos.x + (camBounds.x / 2)) - horizontalPlayerBuffer, Time.deltaTime * transitionSpeed);
        }
        else
        {
            //Left Lerp finished
            playerPos.x = (targetPos.x + (camBounds.x / 2)) - horizontalPlayerBuffer;
        }
    }
    else if (currentTransitionDir == transitionDirection.up)
    {
        if (Mathf.Abs((targetPos.y + verticalPlayerBuffer) - playerPos.y - (camBounds.y / 2)) > minLerp)
        {
            //Up Transition
            playerPos.y = Mathf.Lerp(playerPos.y, targetPos.y + verticalPlayerBuffer - (camBounds.y / 2), Time.deltaTime * transitionSpeed);
        }
        else
        {
            //Up Lerp finished
            playerPos.y = targetPos.y + verticalPlayerBuffer - (camBounds.y / 2);
        }
    }
    else if (currentTransitionDir == transitionDirection.down)
    {
        //The player has to be moved a little further due to the space the GUI takes up
        if (Mathf.Abs(((targetPos.y + (camBounds.y / 2)) - (verticalPlayerBuffer + guiBuffer)) - playerPos.y) > minLerp)
        {
            //Down Transition
            playerPos.y = Mathf.Lerp(playerPos.y, (targetPos.y + (camBounds.y / 2)) - (verticalPlayerBuffer + guiBuffer), Time.deltaTime * transitionSpeed);
        }
        else
        {
            //Down Lerp finished
            playerPos.y = (targetPos.y + (camBounds.y / 2)) - (verticalPlayerBuffer + guiBuffer);
        }
    }

    //Finally move the player
    player.transform.position = playerPos;
}

Hopefully this makes sense. This is just my implementation of the proposed system, however, and there are many alternatives. Each implementation comes with its own issues, and it’s always a good habit to point out the issues in your work. If there’s anything I learnt from my last article, it’s that self-evaluation is pretty important, so let’s go over a few issues I’ve thought of with my implementation, in the hope that anyone learning from this can do it better.

The first thing is that the transition only works in the vertical and horizontal directions, so diagonal transitions are out. The camera is also unable to pan vertically in its current state, although I expect with a bit of shenanigans I could find a way to work around it. The only other issue with the current system worth mentioning is that if there is a wall or obstacle in the way of the player’s target position during a transition, the player will be forced inside the object.

Thankfully a lot of these obstacles are avoidable: we don’t place walls where the player is meant to transition, and we don’t place diagonal transitions. Simple!

Anyway, I think I’ve rambled on long enough. Hopefully this will help some poor sod out there looking to imitate Mega Man or Shovel Knight, like I was!

Thanks for reading!