Posts

The Art of Profiling and Benchmarking: Making Software Faster on AArch64

As the SPO600 course nears its end, I have learned a lot about profiling and benchmarking, two important skills for making software faster. This blog post covers these topics and what I learned in the course.
What is Profiling and Benchmarking
Profiling is examining a program to see which parts use the most resources, such as CPU time and memory. It helps find the slow parts that need fixing.
Benchmarking is running tests to measure how fast a program is. This helps compare different versions of the software or different ways of doing things.
Tools for Profiling and Benchmarking
In the course, we used several tools to profile and benchmark our code:
perf: A powerful tool for profiling on Linux. It gives details about CPU use, cache hits, and more.
gprof: A GNU tool that shows the time spent in each function.
Valgrind: Mostly for finding memory leaks, but it also has a profiler called callgrind for looking at function calls.
Steps i...
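The benchmarking idea described above (run a test, measure how long it takes, compare) can be sketched in C. This is a minimal illustration, not the method used in the course: the names `now_ns`, `sum_to`, and `best_of` are hypothetical, the workload is a stand-in, and it assumes a POSIX system that provides `clock_gettime` with `CLOCK_MONOTONIC`.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a monotonic clock and return the time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical workload: sum the integers 0..n-1.
 * `volatile` keeps the compiler from optimizing the loop away. */
uint64_t sum_to(uint64_t n)
{
    volatile uint64_t total = 0;
    for (uint64_t i = 0; i < n; i++)
        total += i;
    return total;
}

/* Run the workload `runs` times and return the fastest run in
 * nanoseconds. Taking the minimum reduces noise from other processes. */
uint64_t best_of(int runs, uint64_t n)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < runs; r++) {
        uint64_t start = now_ns();
        (void)sum_to(n);
        uint64_t elapsed = now_ns() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling `best_of(10, 1000000)` before and after a code change gives two numbers that can be compared directly; a profiler such as perf would then show where that time is actually spent.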

Project Stage 3: Integration, Tidy, & Wrap

Introduction
Stage 3 of the SPO600 Summer 2024 project focused on integrating all components, resolving outstanding issues, and ensuring a functional proof-of-concept for the GCC AFMV (Automatic Function Multi-Versioning) feature. My specific tasks involved updating the GCC documentation for IFUNC (Indirect Functions) and FMV (Function Multi-Versioning), as well as documenting the new AFMV feature.
My Tasks and Progress
1. Updating GCC IFUNC Documentation:
Overview: Expanded explanations on IFUNC mechanisms, emphasizing how it allows runtime selection of different function implementations based on CPU capabilities.
Syntax and Examples: Provided detailed syntax for declaring IFUNCs and writing resolver functions. Included examples for both x86 and ARM architectures to illustrate usage.
Architecture-Specific Details: Described specific considerations for x86 and ARM architectures, such as efficient resolver function implementation.
Best Practices: Highlighted the importance of maki...
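The IFUNC mechanism summarized above (a resolver picks one implementation at load time based on CPU capabilities) can be sketched with GCC's `ifunc` function attribute. This is an illustrative sketch, not the example from the updated documentation: `add`, `add_generic`, `add_tuned`, and `resolve_add` are hypothetical names, both implementations do the same work, and it assumes GCC (or a compatible compiler) on an ELF platform such as GNU/Linux. Only an x86 capability check is shown; on other architectures, including AArch64, this sketch simply falls back to the generic version.

```c
/* Two hypothetical implementations of the same operation. In real code
 * the "tuned" one would use instructions only some CPUs provide. */
int add_generic(int a, int b) { return a + b; }
int add_tuned(int a, int b)   { return a + b; }

/* Resolver: runs once, when the dynamic loader binds `add`, and returns
 * a pointer to the implementation that all later calls will use. */
static int (*resolve_add(void))(int, int)
{
#if defined(__x86_64__) || defined(__i386__)
    /* GCC requires an explicit init call before querying CPU features
     * from inside an IFUNC resolver. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return add_tuned;
#endif
    return add_generic; /* safe default for every other CPU */
}

/* Declare `add` as an indirect function bound through the resolver. */
int add(int a, int b) __attribute__((ifunc("resolve_add")));
```

Callers simply write `add(2, 3)`; the choice of implementation is invisible to them. Because the resolver runs very early, it is usually kept small and avoids calling into libraries that may not be relocated yet.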

Project Stage 3: Beginning and Planning

Introduction
As we move into Project Stage 3, our focus is on integrating everything we’ve worked on so far, ironing out any issues, and making sure our proof-of-concept for the GCC AFMV (Automatic Function Multi-Versioning) feature works seamlessly. This stage is crucial for bringing the project to a successful close.
My Tasks
Update Documentation for GCC IFUNC and FMV:
Goal: Improve the current documentation to reflect the latest implementations for different architectures.
Plan: Gather detailed information on the current implementations. Expand explanations, add examples, and highlight best practices. Ensure the documentation is comprehensive and easy to understand.
Document the New AFMV Feature:
Goal: Create detailed documentation for the newly implemented AFMV feature.
Plan: Introduce AFMV, its purpose, and its benefits. Provide clear usage instructions and examples. Highlight how AFMV makes it easier to use processor features without changing the source code.
Pl...

Optimizing Software for AArch64 Architecture: An In-depth Exploration

As my SPO600 course nears its end, I've had the opportunity to delve deeply into various aspects of software optimization and porting, particularly for the AArch64 architecture. This blog post aims to summarize the key learnings and practical insights gained from this journey.
Introduction to AArch64
AArch64, introduced as part of ARMv8, is the 64-bit execution state of the ARM architecture. It offers several advantages over its predecessors, including improved performance, enhanced security features, and support for large address spaces. Understanding the nuances of AArch64 is crucial for optimizing software to leverage its full potential.
Binary Representation and Endianness
One of the foundational topics we covered was the binary representation of data and endianness. AArch64, like many modern architectures, supports both little-endian and big-endian modes, although little-endian is the default. This flexibility allows AArch64 to maintain compatibility with various data formats ...
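To make the endianness discussion above concrete, here is a small C sketch that detects the byte order of the running system and converts a 32-bit value between the two orders. The function names `is_little_endian` and `swap32` are hypothetical, chosen for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if the lowest-addressed byte of a multi-byte integer holds
 * its least significant byte (little-endian), otherwise 0. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1); /* inspect the first byte in memory */
    return first == 1;
}

/* Reverse the byte order of a 32-bit value, converting between
 * little-endian and big-endian representations. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```

On a typical little-endian AArch64 Linux system `is_little_endian()` returns 1, and `swap32(0x11223344)` yields `0x44332211`, which is the conversion needed when reading big-endian data such as network byte order.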

Reflections and Conclusions After Completing Project Stage 2 - Enhancing GCC IFUNC and FMV Documentation

Introduction
As I wrap up Stage 2 of the project, focusing on updating the GCC documentation for IFUNC (Indirect Functions) and FMV (Function Multi-Versioning), I am filled with a sense of accomplishment and satisfaction. This stage has been an insightful journey, deepening my understanding of GCC's capabilities and the importance of clear, comprehensive documentation. In this blog post, I will share my experiences, insights, and conclusions from this stage of the project.
Understanding the Task
At the outset, my primary goal was to enhance the documentation for GCC's IFUNC and FMV features. These features are crucial for developers aiming to optimize their software for a wide range of processors, each with its own set of capabilities. The challenge was to create documentation that is not only accurate and detailed but also accessible and easy to understand for developers of all skill levels.
Key Learning Points
IFUNC (Indirect Functions): IFUNC allows developers to write ...