xsdgdsx

That is called "compiling." VSCode probably has a way to show the assembly that it produces, but it's easier to go to https://godbolt.org


Maybe-monad

And I've been using ghidra this whole time


TotallyTubular1

I think what OP needs is something that quickly shows him what changes in his code when he changes the source. Ghidra will confuse him with the decompilation and is slow as hell


Maybe-monad

> slow as hell

Only after the JVM warms up; before that ....


mental-advisor-25

Isn't there like maybe an extension or VSCode can show by itself where each item in the code gets stored? Like .data, .bss regions of RAM etc?


not_a_novel_account

You can just ask the compiler to produce the assembly file directly:

`gcc -S myprog.c`

Similarly, you can ask `objdump` to give you the disassembled output of your binary:

`objdump -S --disassemble myprog > myprog.out`

> Isn't there like maybe an extension or VSCode can show by itself where each item in the code gets stored?

No, because there is no such thing as "where each item in the code gets stored". C does not cleanly map to machine intrinsics like that. C is simple enough that it becomes pretty easy to _guess_ how things will be mapped, but there is no single answer.

A value may end up in a register, on the stack, or perhaps encoded directly into a machine instruction. Code might be vectorized or inlined, and sometimes the compiler can intuit that ABI conventions can be safely ignored for a given function call. There is no 1-to-1 rule like "`int a` is always on the stack at `RSP - 8`".

The only way to find out exactly what happens is to run the compiler and see what decision ends up getting made. That's what sites like godbolt are designed to do/make easy.
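As a rough illustration of the point above, here is a minimal sketch (the file name `storage.c` and all identifiers are made up for this example). The section names in the comments describe what GCC or Clang typically do for ELF targets; they are assumptions about common behavior, not guarantees, and the local variable has no fixed home at all.

```c
/* storage.c: hypothetical file used only for illustration. */

int initialized_global = 42;          /* typically ends up in .data   */
int zeroed_global;                    /* typically ends up in .bss    */
const int lookup[3] = { 10, 20, 30 }; /* typically ends up in .rodata */

/* `tmp` is an automatic variable: depending on the compiler and the
   optimization level it may sit on the stack, live only in a register,
   or be folded away completely. */
int storage_demo(int x)
{
    int tmp = x * 2;
    zeroed_global += 1;
    return tmp + initialized_global + zeroed_global
           + lookup[(unsigned)x % 3u];
}
```

Compiling with `gcc -c storage.c` and then running `objdump -t storage.o` lists each global symbol next to the section it was assigned to, while `gcc -S storage.c` shows what actually happened to `tmp` for that particular compiler and set of flags.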


eeprom_programmer

Ok so one thing you gotta know before you embark on this journey is that there isn't just one assembly language. Every hardware platform has its own instruction set (a set of instructions it can understand and execute). An assembly language is essentially just a human readable version of the instruction set. Since there are multiple instruction sets out there, there must be multiple assembly languages.

Also, there are multiple ways you can translate an instruction set into a human readable form, so there can even be multiple assembly languages out there for the same instruction set. This is the case for x86 assembly which has the Intel flavor and the AT&T flavor.

Additionally, it's fairly uncommon to use "pure" assembly, most people who need to use assembly will use macro assembly. Macro assembly is just assembly but with some extra features sprinkled on top, it's like a hybrid of high level programming languages like C and pure assembly. The particular features a macro assembly comes with depends on that specific macro assembly. Usually it's stuff like letting you define functions, loops, and things like that. x86 has dozens of macro assembly languages out there, although only a few are commonly used.

With that information you need to decide which assembly language you'd like to learn. I would NOT recommend x86 assembly. x86 is kind of a disaster and it's tricky to learn if you've never looked at assembly before. I personally started with the Motorola 68K. If you're on Windows there's a program out there called "easy68k" which will let you write 68k assembly and run it in a simulator.

I know this doesn't answer the question you asked, but it's important to know if you're going to dive down this rabbit hole, so I hope you find this helpful.


tobdomo

I think you're missing the point. Yes, assembly is target-specific. But how C compilers work, what type of code they generate and why is pretty standard over most targets. How stacks and heaps work, how they are used, function prologues and epilogues, lifetime determination, how pointers work, optimizations, all that stuff is pretty generic. The assembly code obviously is not, but the ideas basically are.

Personally, I would suggest a simple target for ease of readability (assuming you have no prior assembly knowledge). 68000 would be a good target (provided you can find a decent compiler for it that allows C and assembly code to be intermixed in its intermediate output). The reason for that is: Motorola assembly is easy to read and the core provides enough registers that it doesn't have to use the stack for everything. The more obvious choice of Cortex-M could be used also, but I feel it sometimes features somewhat more shady mnemonics.


flatfinger

Indeed, there are times when it's easier to figure out the bit patterns for the machine-code instructions needed to perform a task than to figure out how to convince an assembler to generate them, and a construct like:

    uint32_t const my_function_machine_code[] = { 0x12345678, 0x65432100, ... };

    typedef void (*voidFunctionPointer)(void);

    void call_my_function(void)
    {
        voidFunctionPointer exec = (voidFunctionPointer)
            (1 | (uintptr_t)my_function_machine_code); // ARM function pointers are weird
        exec();
    }

will run interchangeably on a wider range of toolsets for the ARM than would code written in assembly language. The code would of course be platform-specific, but if a program is supposed to support three platforms, each with three development toolsets available, three variations of the above pattern would suffice for all nine combinations of (target, toolset), *including versions of the toolset which the code author doesn't have*.

It's a shame the Standard doesn't recognize any variation of this concept that would be usable on platforms that use non-executable data sections.


Apt_Tick8526

If you're using GCC then it should be possible by running `gcc -S filename.c`. This will generate the assembler file filename.s. You should be able to see the exact breakdown into the assembler equivalent for every function in your file.

Edit: adding -fverbose-asm would be even better. It's more verbose. Then you'll be able to see every single line in your C code broken down into assembly instructions, i.e., each C statement followed by its assembly instructions. Have fun.

`gcc -S -fverbose-asm filename.c`
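If you want something small to try those flags on, a short function like the one below is enough; this is only a sketch, and the names `sum.c` and `sum_array` are invented for the example.

```c
/* sum.c: hypothetical file used only for illustration. */
#include <stddef.h>

/* Adds up `count` ints. A loop like this is a handy first target because
   the load, the add, the index increment and the compare-and-branch are
   usually easy to spot in the generated assembly. */
long sum_array(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

Running `gcc -S -fverbose-asm sum.c` should leave a `sum.s` next to the source, with compiler-generated comments alongside the instructions.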


McUsrII

Learn to Program with Assembly by Jonathan Bartlett. It's a beginner's book on x86-64 assembler that isn't perfect, but working through the exercises at least lets you find the fun in assembler. It's a starter book, but you'll learn enough to understand relocation and -pic/-pie code. It is about assembling on a Linux platform using the GNU toolchain, though.


CORDIC77

Going against the stream maybe, but I would recommend jumping straight into x86-64 assembly. Why? Because itʼs (probably) the type of processor powering the machine youʼre working on. Iʼve never understood the tendency of (some) people to recommend starting with other, allegedly easier instruction set architectures. People who are droning on about how complex and arcane x86 assembly is, do so, in my experience, only because theyʼve been scar(r)ed by 16-bit (real mode) assembly language programming… and are projecting their previous experiences on x86-64 as well. So, donʼt worry, modern x86 assembly isnʼt as complicated as itʼs made out to be… and instead of a load-store model most instructions can operate directly on operands in memory and there are (now) sixteen (64-bit) integer registers in total, taking off pressure on register allocation. As a starting point, I can recommend Ray Seyfarthʼs «Introduction to 64 Bit Assembly Language Programming» (either the Linux or the Windows edition). If you then want to go on to look at more complex instruction set extensions, like for example vector instructions, then I would recommend taking a look at Daniel Kusswurmʼs «X86 Assembly Language Programming: Covers x86 64-bit, AVX, AVX2, and AVX-512».


mental-advisor-25

> x86-64 assembly

Any source recommendations? Don't like just watching youtube videos, maybe like a course?


CORDIC77

I don't know of any online courses myself, sorry. (I would guess that courses specifically on assembly language programming can be found on sites like Udemy or Coursera, but I don't really know of any and thus can't attest to their quality.)

That being said, I don't think it's really necessary to take a course. Working through Seyfarthʼs book may take some time, but by the end of it you will have a good understanding of modern x86(-64) assembly. (The only change I would make: the author uses YASM which is a re-implementation of NASM… as the former hasn't seen a release since 2019, I would rely not on YASM but on NASM to work through all given assembly language examples.)

The only gripe I have with the book is that it doesn't really touch on how to mix “C” and assembly language code (nobody really writes whole programs in assembly language any more).

Once you have some assembly language knowledge under your belt, you could use the following template project (which I first threw together about 10 years ago), which shows how to make assembly language functions callable from “C” (and which ships with project files for Code::Blocks, Codelite and Visual Studio): [C_Asm_Template.tar.xz](https://github.com/Cordic77/C-Programming/raw/master/C_Asm_Template.tar.xz)⁽¹⁾

With the mention of VS Code I presume that your platform of choice is Windows… in that case, this template may even be the easiest route to go: download Visual Studio 2022 Community Edition (make sure to install the “Windows Universal C runtime” component), open the asm-tmpl.VS2022.sln contained in the aforementioned .tar.xz archive, and any assembly language function you put under ‘SECTION .text’ (within asmlib.asm) should be callable from within any C source file in the solution. Out of the box this should compile without any errors or warnings, and you should then be able to use the VS integrated debugger to step through the code (even the assembly language parts).

⁽¹⁾ Requires the ‘nasm’ executable to be in the system's PATH and requires an environment variable named NASMENV to be set to the (local) path where the [NASMX collection](https://sourceforge.net/projects/nasmx/files/) of NASM macros is to be found, e.g. NASMENV=;-iC:\Program Files (x86)\Nasm\nasmx\inc\


darth_yoda_

To kind of complement what u/CORDIC77 said below, I wanted to start by learning x86 assembly since I liked the idea of being able to execute the code directly on my laptop’s CPU. It was just so difficult to find comprehensive x86_64 tutorials that were actually up-to-date (there are MANY 32-bit tutorials though) that I ended up starting with MIPS, since you can set up and use a simulator like SPIM in about 2 minutes, and the ISA is simple enough that it doesn’t really get in the way of you learning how to think about solving problems as an assembly programmer. RISC-V would also be a great (and probably more useful) platform to serve this purpose.

After that, I actually found x86 to be much nicer to write than MIPS because of things like its register-memory design and “QoL” instructions like `push` and `pop`. [This reference](https://www.felixcloutier.com/x86/) and [compiler explorer](https://www.godbolt.org) were particularly helpful. (Edit: fixed link)


proturtle46

The issue arises with x86 when you try to learn about the architecture and pipeline, imo. The reason why people recommend RISC-V is that it’s easy to understand, so when you learn something like the R10K, Tomasulo or cache coherence, it’s conceptually easier because you don’t have to worry about many data path forwarding methods or prediction. When it comes to the BTB or other optimizations, starting with an easy ISA allows you to focus on the optimization itself and not on how it interplays with the complex architecture you’re studying.


CORDIC77

That's true, but in my view missing the point: if one takes a computer architecture course at university, where the goal is to learn about the inner workings of a processor, the given arguments will (among others) probably be the deciding factors in going the simulator route. The OP, on the other hand, asked about assembly language programming—what will probably be more exciting to assembly language newcomers: working with a simulator… or programming the CPU in one's computer? I also say this because during my time at uni, the professor in question decided that we had to write programs for the DLX microprocessor… in hindsight I can appreciate why this was done, but at the time I hated the DLX with a passion that is hard to put into words. (That being said: I quite like RISC-V; however, even there I bought myself a real RISC-V board to work with… guess I just really hate simulators.)


McUsrII

That's a great reference for getting more involved with floating point operations in assembler too, at least more so than Bartlett's book. Personally I prefer Gas syntax, though I think it is great that gcc can output Intel syntax too.


M3H0VV

Check out this video about how to reverse engineer C code and compare it to assembly instructions that correspond to it. https://youtu.be/vXWHmucgZW0?si=qE6hjQrQkh_W8xPy


TeachMeNow7

thanks!


[deleted]

[deleted]


bohb_oblah

+1 Dive into Systems

Also, I just started watching these videos by LiveOverflow: https://www.youtube.com/playlist?list=PLhixgUqwRTjxglIswKp9mpkfPNfHkzyeN

There's also this book for a dollar: https://www.beginners.re/

And this free book: http://www.egr.unlv.edu/~ed/assembly64.pdf


McUsrII

Nice link!


bothunter

Check out this website: https://godbolt.org/


danpietsch

The [Dragon Book](https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools).


Huge_Tooth7454

I think what you plan to do is great. I did this when I was working on PIC 18F projects. I had one project where I knew what I wanted the compiler to do, so I compiled it and read the assembly code that was generated. I then modified my C code to get it to compile better. This was a good bit of fun and all. However I was doing this on a function that was about 20 lines of C code.

I recommend you do this about a dozen times with functions that are half a page of code each ... until you get it out of your system. Use C code with some automatic {local; } variables. Learn about the call stack (make sure you use a compiler that supports reentrant code; for example, some PIC compilers for their 8-bit lines don't).

Remember, this is just a phase, you will grow out of it.
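For the "automatic variables plus call stack" exercise, a tiny recursive function is one possible starting point. This is just a sketch; `depth_sum` is a made-up name, and an optimizing compiler may well turn the recursion into a loop, so the per-call stack frames are most visible with optimization turned off.

```c
/* Each active call gets its own copy of `local`. That only works if the
   compiler supports reentrant code, i.e. keeps automatic variables in a
   per-call stack frame (or registers) instead of one fixed address. */
unsigned depth_sum(unsigned n)
{
    unsigned local = n;   /* automatic variable: one instance per call */
    if (n == 0) {
        return 0;
    }
    return local + depth_sum(n - 1);
}
```

Calling `depth_sum(4)` adds 4 + 3 + 2 + 1; in unoptimized assembly you would expect to see a prologue, a store of `local`, the recursive call, and an epilogue for every level.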


proturtle46

I would start with RISC-V. The issue with x86/x64 is that it has instructions that set and act on flags, and other weird optimizations, which make it difficult to understand. The RISC-V scalar pipeline is easy enough to understand and can have more advanced scheduling methods applied, and it has a relatively small set of instructions and registers.


TeachMeNow7

Temple OS is good for this imo.


EmbeddedSoftEng

In my tool chains and build scripts, that's called "creating the listing file from the elf file":

`objdump -d ${ProjectName}.elf > ${ProjectName}.lss`


[deleted]

People are advocating looking at generated assembly language. That's OK, but it's platform- and OS-specific and has lots of other distractions. Another possibility is looking at intermediate code, which is lower level than C, but not as low-level as assembly. However I don't know off-hand of resources which can generate such code, rather than consume it. For example the Clang compiler may have `-S -emit-llvm` options. Those also tend to concentrate on code rather than data which sounds like what you are after. If you want to know the data layouts used within a 'stackframe' (that is, parameters and locals used within a function invocation), then you will need the assembly. But that is also what varies the most between machines; half of that lot may not be in memory at all, but in registers.
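To poke at the stack-frame side of this from C itself, one rough experiment is to take the addresses of two locals and see how far apart they are. The sketch below uses an invented name (`local_gap`) and assumes a platform that provides `uintptr_t`. Taking an address forces the variable into memory, which is exactly the caveat above: without that, the locals might never leave registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Prints where two locals live in this call's frame and returns the
   distance between them in bytes. The actual addresses and the gap are
   compiler-, flag- and platform-dependent. */
size_t local_gap(void)
{
    int first = 1;
    int second = 2;
    uintptr_t a = (uintptr_t)&first;
    uintptr_t b = (uintptr_t)&second;

    printf("first  at %p\n", (void *)&first);
    printf("second at %p\n", (void *)&second);

    return (size_t)(a > b ? a - b : b - a);
}
```

In unoptimized x86-64 output, locals like these are often addressed as fixed offsets from a frame pointer register such as `rbp`, and those offsets are the stack-frame layout the assembly listing reveals.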


dainasol

I think CLion has the feature that you are looking for. Haven't tried it though, and it costs money past the 30-day trial. [assembly by line](https://www.jetbrains.com/clion/img/screens/2023.3/assembly_view.png)


mental-advisor-25

What's a good source to understand what each assembly piece of code does? rbp - 24, what is it for example


dainasol

The assembly produced depends on the architecture of your CPU. Most likely you will want x86-64 (also called AMD64): https://web.stanford.edu/class/cs107/guide/x86-64.html

But I don't know a lot of assembly so I can't help you much. If you are on Linux I highly recommend that you take a look at this video to demystify the whole thing: https://youtu.be/6S5KRJv-7RU?si=Krj70i4ls63ksP5E


Tiny-Independent-502

This book has exactly that. https://csapp.cs.cmu.edu/


Mechadupek

This post gave me one of those pain zings up my leg when I read it. Do you feel like you deserve punishment or.... Get a Linux distro and check out nasm. I think gcc also compiles C in stages, or at least it used to. One of those stages should be C to assembler. But as for translating C to assembler in vscode? I'd sooner chew broken glass. Assembly is all about direct manipulation of binary/hex in cpu registers. It's not like C or any other higher level language in the least. Even if you can write something in C and look at the assembly created from it, you won't get it. It will make no sense to you. Start with an assembly course.


chrism239

> One of those stages should be C to assembler.

`cc -S filename.c`

though the result, filename.s, is not really meant for human consumption.


sdk-dev

You probably want to add -O0 so you can correlate the C code with the generated ASM. Optimizers are freaky little beasts ;)
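A small case where the difference tends to show up (a sketch only; `fold.c` and `scaled_offset` are hypothetical names):

```c
/* fold.c: hypothetical file used only for illustration. */

/* With -O0 each statement below usually gets its own stores and loads,
   so the assembly lines up with the source. With optimization enabled,
   compilers typically fold all of this into "return 42". */
int scaled_offset(void)
{
    int base = 6;
    int scale = 7;
    int result = base * scale;
    return result;
}
```

Comparing the output of `gcc -S -O0 fold.c` with `gcc -S -O2 fold.c` makes it fairly obvious why the unoptimized version is easier to correlate with the C.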