X86_64 IDE for MAC (rebirth of my Multi Emulator)

codestarman

4.71/5 (8 votes)

Jan 4, 2018

GPL3

10 min read

This project describes an X86 assembler IDE for the MAC developed using JavaFX. The starting point was an X86 emulator developed by the author in C++, which was subsequently ported to C#.

Download FinalIDE.zip (Java source, Javadoc and .jar executable) - 146.7 KB

Final Update

The attached files (see ZIP) now include the Java source directory, Javadoc directory and executables. This represents the final version of the application, with all the intended functionality. This latest edition covers:

the ability to set a default Path for all files (work, log and data)
a facility to save the source, object and executable file names, and the scripts in use, to a file, so that they need not be re-entered between sessions
scrollbars to facilitate display on smaller screens
a simple HELP page with references
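
As an illustration of the save-to-file facility listed above, the sketch below stores session values in a standard Java properties file and reads them back. It is not taken from the attached source: the class name `IdeSettingsSketch`, the key names and the example paths are all hypothetical, and the real application may persist its settings quite differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper: names and file layout are illustrative only.
final class IdeSettingsSketch {

    // Write the session values to a plain-text properties file.
    static void save(Path file, Properties settings) {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            settings.store(out, "IDE session settings");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the values back; a missing file simply yields an empty set.
    static Properties load(Path file) {
        Properties settings = new Properties();
        if (!Files.exists(file)) {
            return settings;
        }
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            settings.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return settings;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ide-settings", ".properties");
        Properties p = new Properties();
        p.setProperty("default.path", "/Users/me/asm");   // assumed key and value
        p.setProperty("source.file", "hello.asm");        // assumed key and value
        save(file, p);
        System.out.println(load(file).getProperty("source.file"));
        Files.delete(file);
    }
}
```

A properties file is plain text, so the saved names and scripts can also be inspected or corrected by hand in any editor.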

Appendix C provides a screenshot of the final version in use.

Revision

Source files and a JAR executable are now included for the application illustrated in Appendix A. Prerequisites are:

Java runtime must be installed
this IDE application is currently configured for execution on a MAC
the source included was developed using Netbeans and Eclipse IDEs with SceneBuilder for JavaFX graphics development
NASM, GNU linker LD, LLDB (run using the BASH Terminal app), Affinic Debugger (free version) and open source Brackets editor must be installed and accessible from the account in which the executable is run
Source, object and executables referenced when running the application must include their full Path. The Paths used in the built-in scripts for Brackets, NASM, etc may need to be changed in the source and the application rebuilt if 'file not found' errors occur
Various work files are created by the application in the local directory (log files, text files, etc). These files have generic names and naming conflicts are possible.
the source files referenced must be created and saved using Brackets or another editor before they are referenced for assembly/linking
source files must be in the format filename.asm and object files filename.o.
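
Since most 'file not found' errors come down to a missing tool, a small pre-flight check can help. The sketch below searches a PATH-style string for the command-line tools named in the prerequisites. It is an assumption-laden illustration, not part of the attached source: the class and method names are hypothetical, the command names `nasm`, `ld` and `lldb` are assumed to match the installed executables, and GUI applications such as Brackets and Affinic are not covered.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check; not part of the attached IDE source.
final class ToolCheckSketch {

    // Returns the first executable file called `tool` found in the
    // directories of a PATH-style string, or null if there is none.
    static File findOnPath(String tool, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, tool);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate;
            }
        }
        return null;
    }

    // Lists the tools that could not be found.
    static List<String> missing(List<String> tools, String pathValue) {
        List<String> notFound = new ArrayList<>();
        for (String tool : tools) {
            if (findOnPath(tool, pathValue) == null) {
                notFound.add(tool);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Assumed command names for the assembler, linker and debugger.
        List<String> tools = Arrays.asList("nasm", "ld", "lldb");
        System.out.println("Missing tools: " + missing(tools, System.getenv("PATH")));
    }
}
```

Note that an application launched from the Finder may see a different PATH from one launched in Terminal, which is one reason full paths are used in the built-in scripts.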

Introduction

In a previous article (X86/ARM Emulator) I described a project to develop an X86 assembly language emulator. At the time the article was published I had completed a C++ application that was 90% working for X86 code emulation. Other functionality, such as handling ARM and X64 code, was included only as outline frameworks. This follow-on article looks at subsequent developments, first widening the functionality of the original application, porting it to C# and then a complete sidestep into an assembler IDE written in JavaFX for the MAC.

The original project was a learning exercise to develop my C++ skills and at the time the article was written I had met my objectives. However, positive feedback encouraged me to consider a path for further development. On recommencing development, I discovered from Microsoft's response to a Visual Studio bug I had reported that they had effectively deprecated the use of C++ to develop Windows Forms UIs. My problem report had arisen from trying to port the application development to a newer version of Visual Studio. I looked at options for new learning that resolved this issue and came to a decision to port the application to C#. The downside of this was that I could no longer use the original inline assembler code or readily employ the X86 assembler modules linked into the original C++ application. The upside was that C# seemed to a novice to offer ease of use, a wide range of library classes, a syntax which seemed easier to remember, and full separation of designer and user code using partial classes.

Appendix B shows some screenshots of the C# application at the point where it could handle a wide range of small X86 applications (minus System Calls) and was also capable of recognising some X64 and ARM code. Outline functionality was in place to add .NET IL and Java Bytecode.

It was at this point that the sidestep occurred. In order to support my other hobby of photography, and to overcome my frustration with Windows 8, I decided to replace my Windows PC with an Apple iMac. At first, I tried to continue my C# skills development running Windows 10 within various VMs. Whilst this was just about practicable, the whole process proved slow and cumbersome, and regular upgrades to Windows and Visual Studio generated delays, performance issues and challenges with my code that proved time-consuming. This made me start thinking about a change of tack for my learning, perhaps using Xcode to develop MAC OS or IOS applications. Unfortunately, experiments with Xcode proved just as frustrating, both because I was faced with a new language, Swift, and a new graphics framework, and because every new version of Xcode seemed to have performance issues on my iMac for which I have as yet failed to find a permanent fix.

This had me thinking about a universal language, Java. Its syntax is very like C#, and JavaFX using SceneBuilder has many similarities with the Windows Forms designer. Perhaps more significantly, I failed to find an assembler IDE designed for the MAC platform (other than Xcode, which is hardly user-friendly in that role). Rather than develop my skills through the coding of an emulator, which is really only useful as a training aid, I realised I could try my hand at a piece of software that for me had real practical value, the X86 Assembler IDE described in this article.

Background

This project started mostly as a learning vehicle rather than a finished product. I had been struck by the rebirth of interest in the nuts and bolts of IT, for example, enthusiasm for the Raspberry Pi, and the recent availability of free or low cost online software engineering courses from Coursera and others. Clearly, there is a thirst for learning that is quite divergent from the demand for plug and play that characterises consumer acquisitions like tablets, smartphones, and gaming consoles. However, with the project goal revised from the production of an emulator to an IDE, the resulting software tool has the potential to offer real practical value, at least for me.

MAC IDE Project

In my professional career, which spanned Data Centre operation, programming, systems analysis and project management through to director roles, the learning curve for each new role was often steep. I quickly understood my strengths and weaknesses, especially as regards memory. Nearly 50 years on I can still remember hexadecimal machine instructions for ICL System 4 mainframes, but the detailed syntax of Bash command line options or high-level programming constructs has a way of escaping me. Thankfully, the Internet has replaced massive reference works and crib sheets. There are many out there who can happily interact with Unix and Bash shells to accomplish everything. They would certainly be able to use the MAC OS Terminal app to accomplish everything needed for X86 program development. Personally, I am more of a Visual Studio, Netbeans and MASM man. The more visual and intuitive the UI, the better I like it.

Using JavaFX it is not difficult to produce an application that acts as a visual UI for Terminal, NASM, Brackets, etc, such that everything can be seen in one place and command line instructions can be scripted to ease the load on one's memory. Appendix A provides screenshots of the JavaFX based application running on a MAC. The assembler, linker and generated executables run as Terminal commands under the control of the Java IDE. The current version targets X86 code for execution on a MAC, but it would not be difficult to produce a version for Windows or to add X64 capabilities, as the underlying functionality is provided by Brackets (source editor), NASM (assembler), the native linker and the relevant OS (running executables). The main omission is a debugger that could be operated under the control of the IDE. Xcode is one possibility but that rather takes me back to square one. Running on Windows, I would employ Notepad++ as the editor and OllyDbg for the as yet undeveloped debugging functionality.

Learning Java, JavaFX graphics, use of Netbeans, NASM and Bash takes some time, mostly in how to interpret and rectify errors. So progress is slow and I have made no attempt as yet to write compact, robust code. Error handling, like documentation, only comes to those that wait. However, I nearly have a piece of software that has its intended first phase user functionality complete and which I can use to understand the niceties of X86 assembler coding on a MAC. The next step is to understand how to add debugging functionality. Additional scripting to tackle X64 code should not prove onerous.

Sharing

This short article is intended as a taster. I intend to complete, test and make the source code available via GitHub, with a follow-up article to alert those interested.

Appendix A

An X86 assembler IDE for the MAC written in JavaFX

On starting the application a single Java scene window appears (Screen 1 below), with prompt text and default names for source and object files. For the application to run as intended, the Brackets editor, Affinic and LLDB debuggers, NASM and the linker LD need to be installed on the machine. The source, object and executable files need to be specified with their full paths as these are utilised by the generated scripts. In a finished version, it would be neater to allow Path values to be set on initial startup in the same way that IDEs usually ask for preferred locations for work files etc. To date, the only issue has been the time it takes for NASM to complete (about a minute regardless of whether started via the IDE or by hand in Terminal).

The use of the software should be reasonably intuitive. The user enters the source and/or object file details and ticks the entries to be used. When the Brackets button is clicked, that application loads with the relevant source file open. When the source editing is complete, the user saves the file within Brackets and clicks list/refresh to display the latest version of that source file within the IDE window. If the source is satisfactory, the user clicks Run NASM and, when that process completes, the NASM assembler listing and the generated object file are in turn displayed in the IDE window. Any errors or feedback written to the Terminal window are captured by the IDE and displayed. If the object file is correctly generated, the user can then either assemble another object file or click the Link button to generate an executable from one or more object files. The correct object file details need to be entered if not already shown, and selected using the Incl buttons. The entry point name for the 'main' routine also needs to be entered if it is not the default. Finally, the executable can be run and the output is seen in the Terminal window (at this stage only console output executables are being targeted). Screen 2 shows an example of the IDE window after executing these steps. Basic debugging facilities can be linked to from the IDE but they are not fully integrated.
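
The assemble, link and capture steps described above can be sketched in Java with `ProcessBuilder`. This is a minimal illustration under stated assumptions rather than the code in the attached source: the class and method names are hypothetical, `macho64` is only one possible NASM output format, and the link command shows just the entry point (`-e`) and output (`-o`) options, whereas a real link on a given MAC OS version may need further options.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of scripted build steps; not the IDE's actual code.
final class BuildStepsSketch {

    // filename.asm -> filename.o, following the naming rule in the prerequisites.
    static String objectFor(String sourcePath) {
        if (!sourcePath.endsWith(".asm")) {
            throw new IllegalArgumentException("expected a .asm file: " + sourcePath);
        }
        return sourcePath.substring(0, sourcePath.length() - 4) + ".o";
    }

    // NASM command line: -f selects the output format, -o the object file
    // and -l the listing file. The .lst name is an assumption.
    static List<String> nasmCommand(String sourcePath, String format) {
        String obj = objectFor(sourcePath);
        String listing = obj.substring(0, obj.length() - 2) + ".lst";
        return Arrays.asList("nasm", "-f", format, sourcePath, "-o", obj, "-l", listing);
    }

    // Linker command line: -e names the entry point, -o the executable.
    static List<String> linkCommand(String entry, String exePath, List<String> objects) {
        List<String> cmd = new ArrayList<>(Arrays.asList("ld", "-e", entry, "-o", exePath));
        cmd.addAll(objects);
        return cmd;
    }

    // Runs a command, merges stderr into stdout and returns everything it
    // printed, so that the text can be shown in the IDE window.
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        Process process = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor();
        return out.append("[exit code ").append(exitCode).append("]\n").toString();
    }

    public static void main(String[] args) {
        // Only prints the commands; pass them to runAndCapture to execute them.
        String src = "/Users/me/asm/hello.asm";   // assumed example path
        System.out.println(String.join(" ", nasmCommand(src, "macho64")));
        System.out.println(String.join(" ",
            linkCommand("_main", "/Users/me/asm/hello", Arrays.asList(objectFor(src)))));
    }
}
```

Passing the command as a list of arguments, rather than one shell string, avoids quoting problems when a full path contains spaces.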

Screen 1 (test version on loading)

Screen 2 (output during testing, the final UI can be seen in Appendix C)

Appendix B

An X86 etc Emulator for Windows written in C# with .NET Windows Forms

Startup Screen

Main Menu Screen (normally use flows from source editing to displaying machine state and running object code)

Entering Source Code and Parsing (as an alternative to the internal editor, Notepad++ can be launched and the resulting file captured)

Capturing source code from Notepad++

Examining generated object code (a full assembler listing can also be produced)

Examining the machine state

How the Application Works

The user interface comprises a series of Windows Forms that allow the user to enter the code, parse and execute it.

One of the 'helper' classes, that do the work behind these forms, is Lexit (lexical analysis and parsing). This takes the source text, splits it into discrete words (tokens), converts these tokens into numeric values discarding comments and whitespace, and then parses the token stream using a set of rules. For example, this code outline illustrates the use of those rules:

// set up loop to parse tokens held as an array of numeric values, comparing the token stream against rules (legal sentences in the grammar)

// each rule comprises: the length of the 'sentence' and then the set of allowed tokens for that legal sentence

// token encoding examples: 238 = lang type X86, 80 = opcode, 64 = 32 bit register, 10 = immediate value, 248 = operand in memory, 116 = instruction label (reference this location in code for jumps etc)

UInt32[] rule0 = new UInt32[2] { 1, 238 };   // example rules/sentences
UInt32[] rule1 = new UInt32[2] { 1, 80 };
UInt32[] rule2 = new UInt32[3] { 2, 80, 64 };
UInt32[] rule3 = new UInt32[4] { 3, 80, 64, 10 };
UInt32[] rule4 = new UInt32[4] { 3, 80, 64, 64 };
UInt32[] rule5 = new UInt32[4] { 3, 80, 64, 248 };
UInt32[] rule6 = new UInt32[5] { 4, 116, 80, 64, 64 };
UInt32[] rule7 = new UInt32[5] { 4, 116, 80, 64, 248 };
UInt32[] rule8 = new UInt32[5] { 4, 116, 80, 64, 10 };

// the token stream is tested against each possible rule, more demanding rules first, simplest last to avoid false positives           

switch (ruleno)
{
	case (8): result = OK = tryRule(ruleno, rule8, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule8[0]; } goto Lnxttok;
	case (7): result = OK = tryRule(ruleno, rule7, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule7[0]; } goto Lnxttok;
	case (6): result = OK = tryRule(ruleno, rule6, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule6[0]; } goto Lnxttok;
	case (5): result = OK = tryRule(ruleno, rule5, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule5[0]; } goto Lnxttok;
	case (4): result = OK = tryRule(ruleno, rule4, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule4[0]; } goto Lnxttok;
	case (3): result = OK = tryRule(ruleno, rule3, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule3[0]; } goto Lnxttok;
	case (2): result = OK = tryRule(ruleno, rule2, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule2[0]; } goto Lnxttok;
	case (1): result = OK = tryRule(ruleno, rule1, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule1[0]; } goto Lnxttok;
	case (0): result = OK = tryRule(ruleno, rule0, G.token, G.tokentype, i, fLookmax); if (OK) { skip = skip + rule0[0]; } goto Lnxttok;
	default: G.writer.WriteLine("X86 Syntax error line " + G.lineno); i++; skip++; result = OK = false; goto Lendline;
}

NOTE: More compact and efficient code is obvious, but the aim was to allow easy debugging before coding for efficiency. 
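
For readers following the Java side of this article, the same idea can be sketched as a table-driven matcher. This is an illustration only and not a port of the Lexit class: the token codes and the rule layout (length first, then the allowed token types) are taken from the outline above, but the class name, the tiny word classifier, and the short opcode and register lists are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: classify words as numeric token types, then test
// the token stream against each rule, most demanding rule first.
final class RuleMatchSketch {
    static final int LANG_X86 = 238, OPCODE = 80, REG32 = 64,
                     IMMEDIATE = 10, MEMORY = 248, LABEL = 116;

    // Same layout as rule0..rule8: element 0 is the sentence length.
    static final int[][] RULES = {
        {1, LANG_X86},
        {1, OPCODE},
        {2, OPCODE, REG32},
        {3, OPCODE, REG32, IMMEDIATE},
        {3, OPCODE, REG32, REG32},
        {3, OPCODE, REG32, MEMORY},
        {4, LABEL, OPCODE, REG32, REG32},
        {4, LABEL, OPCODE, REG32, MEMORY},
        {4, LABEL, OPCODE, REG32, IMMEDIATE},
    };

    // Deliberately short lists, just enough for the example.
    static final Set<String> OPCODES = new HashSet<>(Arrays.asList("mov", "add", "sub", "ret"));
    static final Set<String> REGISTERS = new HashSet<>(Arrays.asList(
        "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"));

    // Maps one word to a token type, or -1 if it is not recognised.
    static int classify(String word) {
        String w = word.toLowerCase();
        if (w.endsWith(":")) return LABEL;
        if (REGISTERS.contains(w)) return REG32;
        if (OPCODES.contains(w)) return OPCODE;
        if (w.startsWith("[") && w.endsWith("]")) return MEMORY;
        if (w.matches("[0-9]+|0x[0-9a-f]+|[0-9][0-9a-f]*h")) return IMMEDIATE;
        return -1;
    }

    // Drops the comment, splits on whitespace and commas, classifies each word.
    // Memory operands containing spaces are not handled by this simple split.
    static int[] tokenTypes(String line) {
        int semi = line.indexOf(';');
        String code = (semi >= 0 ? line.substring(0, semi) : line).trim();
        if (code.isEmpty()) return new int[0];
        String[] words = code.split("[\\s,]+");
        int[] types = new int[words.length];
        for (int k = 0; k < words.length; k++) {
            types[k] = classify(words[k]);
        }
        return types;
    }

    // True if the token types starting at `start` spell out the rule.
    static boolean tryRule(int[] rule, int[] types, int start) {
        int length = rule[0];
        if (start + length > types.length) return false;
        for (int k = 0; k < length; k++) {
            if (types[start + k] != rule[k + 1]) return false;
        }
        return true;
    }

    // Tries the highest-numbered rule first; returns its number, or -1
    // when no rule matches (a syntax error in the article's terms).
    static int matchLine(int[] types, int start) {
        for (int ruleNo = RULES.length - 1; ruleNo >= 0; ruleNo--) {
            if (tryRule(RULES[ruleNo], types, start)) return ruleNo;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(matchLine(tokenTypes("start: mov eax, 10 ; load"), 0));
    }
}
```

Holding the rules in one array replaces the switch with a loop, while keeping the ordering that prevents a short rule from matching the front of a longer sentence.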

// once a match is found for a set of tokens comprising a source line the process skips to the next set (source line)

// The grammar for the X86/X64 parser is described by the outline below. Floating point, MMX, etc instructions are not covered.


//  Statement'      -> [Statement]|[InstrStatement]|[DataDefnStmnt]|[Comment]|[Directive]
//  InstrStatement  -> [Label][Prefix][ValidInstrMneu][Operand1][Operand2][Operand3][Comment]
//  DataDefnStmnt   -> [DataLabel][ValidDefn][DataValue][Comment]
//  AddrLabel       -> [NearLabel]|[FarLabel]|e
//  DataLabel       -> [VariableName]|e
//  ValidInstrMneu  -> ["add"|"sub"|...............]+
//  ValidVarDefn    -> ["byte"|"word"|"dword"]+
//  ValidConstDefn  -> ["equ"]+
//  VariableName    -> [a-zA-Z]+[0-9]
//      NearLabel   -> [a-zA-Z][:]
//      FarLabel    -> [a-zA-Z][:][:]
//  ValidRegister   -> ["eax","ebx"..................."eip"]+
//  Prefix          -> ["rep"|"lock"...................]e
//  Operand1        -> [SizeModifier][ValidExpression]|[ValidExpression]|[SizeModifier][DataLabel]|[AddrLabel]|[ValidImmediate]|e
//  Operand2        -> [SizeModifier][ValidExpression]|[ValidExpression]|[SizeModifier][DataLabel]|[AddrLabel]|[ValidImmediate]|e
//  Operand3        -> [SizeModifier][ValidExpression]|[ValidExpression]|[SizeModifier][DataLabel]|[AddrLabel]|[ValidImmediate]|e
//  Comment         -> [;]+[a-zA-Z0-9]|e
//  SizeModifier    -> ["byteptr","wordptr","dwordptr"]+
//  ValidExpression -> [ValidRegister]|Displ[Base+Index*Scale]|Base[Displ]
//      Displ       -> [ValidImmediate]
//      Scale       -> [0-9]+|e
//      Base        -> [ValidRegister]
//      Index       -> [ValidRegister]|[0-9]+|e
//  Directive       -> to be defined..................
//  NB. Allows [base+displ] as an alternative to displ[base]; in displ[base+index*scale] or [base+index*scale], *scale may be omitted
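
To make the memory operand forms in the NB line concrete, here is a small Java sketch that recognises them with a regular expression. It is an assumption-based illustration, not the emulator's parser: the class name and field names are hypothetical, only decimal displacements are handled, and register names are not validated against the ValidRegister list.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Displ[Base+Index*Scale] operand forms.
final class MemOperandSketch {
    final int displ;      // displacement, 0 if omitted
    final String base;    // base register name
    final String index;   // index register name, or null if omitted
    final int scale;      // scale factor, 1 if omitted

    MemOperandSketch(int displ, String base, String index, int scale) {
        this.displ = displ;
        this.base = base;
        this.index = index;
        this.scale = scale;
    }

    // Groups: 1 = displ before the bracket, 2 = base, 3 = index,
    // 4 = scale, 5 = displ inside the bracket.
    private static final Pattern FORM = Pattern.compile(
        "(\\d+)?\\[([a-z]+)(?:\\+([a-z]+)(?:\\*(\\d+))?)?(?:\\+(\\d+))?\\]");

    // Returns the parsed operand, or null if the text is not one of the
    // bracketed forms (a plain register, for instance).
    static MemOperandSketch parse(String text) {
        String t = text.replaceAll("\\s+", "").toLowerCase();
        Matcher m = FORM.matcher(t);
        if (!m.matches()) {
            return null;
        }
        int displ = 0;
        if (m.group(1) != null) displ += Integer.parseInt(m.group(1));
        if (m.group(5) != null) displ += Integer.parseInt(m.group(5));
        int scale = (m.group(4) != null) ? Integer.parseInt(m.group(4)) : 1;
        return new MemOperandSketch(displ, m.group(2), m.group(3), scale);
    }

    public static void main(String[] args) {
        MemOperandSketch op = parse("8[ebx+esi*4]");
        System.out.println(op.displ + " " + op.base + " " + op.index + " " + op.scale);
    }
}
```

Both `displ[base]` and `[base+displ]` end up as the same displacement and base pair, which is the equivalence the NB line describes.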

The successfully parsed code is stored in a table using the source line number as the key. Other 'helper' classes can be invoked from the main menu to execute (emulate) these parsed lines of code or display them in machine code form line by line or as a full object listing. Emulated execution can be observed using the forms that display registers, memory, stack, etc. The generation of relocatable object code capable of external execution was not included in the original functionality.
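
A table keyed by source line number might look something like the Java sketch below. This is a guess at the shape of the data structure for illustration, not the emulator's actual C# code: the class and method names are hypothetical, and each parsed line is represented simply as an array of numeric tokens.

```java
import java.util.TreeMap;

// Hypothetical sketch of a parsed-code table keyed by source line number.
final class ParsedLineTableSketch {
    // A sorted map keeps the lines in source order even though blank and
    // comment-only lines leave gaps in the numbering.
    private final TreeMap<Integer, int[]> lines = new TreeMap<>();

    // Stores a copy of the tokens for one successfully parsed line.
    void put(int lineNo, int[] tokens) {
        lines.put(lineNo, tokens.clone());
    }

    // Returns the tokens for a line, or null if that line was not parsed.
    int[] get(int lineNo) {
        int[] tokens = lines.get(lineNo);
        return tokens == null ? null : tokens.clone();
    }

    // First parsed line, or -1 if the table is empty.
    int firstLine() {
        return lines.isEmpty() ? -1 : lines.firstKey();
    }

    // Next parsed line after lineNo, or -1 at the end of the program.
    int nextLine(int lineNo) {
        Integer next = lines.higherKey(lineNo);
        return next == null ? -1 : next;
    }

    public static void main(String[] args) {
        ParsedLineTableSketch table = new ParsedLineTableSketch();
        table.put(1, new int[] {80, 64, 10});   // opcode, register, immediate
        table.put(4, new int[] {80, 64, 64});   // opcode, register, register
        for (int n = table.firstLine(); n != -1; n = table.nextLine(n)) {
            System.out.println("line " + n + " has " + table.get(n).length + " tokens");
        }
    }
}
```

Stepping with `nextLine` gives sequential execution, while a jump to a label only needs the line number recorded for that label.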

Appendix C

Final UI for X86_64 IDE for MAC